Activity (Process)

Iterative Refinement for LLM Reasoning

This training-based scaling method involves a cyclical process to enhance a Large Language Model's reasoning abilities. Initially, the LLM generates solutions and their corresponding reasoning paths for a given set of problems. These outputs are then evaluated by either human reviewers or automated verifiers. Only the correctly reasoned paths are selected and added to the training dataset. The LLM is then fine-tuned on this newly augmented data. This loop of generation, verification, and retraining progressively improves the model's intrinsic capacity for reasoning.
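The loop described above can be sketched in miniature. Everything here is a hypothetical stand-in: `generate`, `verify`, and `fine_tune` are toy placeholders for an LLM sampler, an automated verifier, and a fine-tuning step, and the "model" is just a table of answer weights, not a real network.

```python
import random

def iterative_refinement(model, problems, generate, verify, fine_tune,
                         rounds=3, samples_per_problem=4):
    """Generation -> verification -> retraining cycle from the text."""
    dataset = []  # accumulated (problem, reasoning_path, answer) triples
    for _ in range(rounds):
        for problem in problems:
            for _ in range(samples_per_problem):
                path, answer = generate(model, problem)
                if verify(problem, answer):            # keep only correct paths
                    dataset.append((problem, path, answer))
        model = fine_tune(model, dataset)              # retrain on filtered data
    return model, dataset

# --- toy stand-ins for the real components (all hypothetical) ---
random.seed(0)
PROBLEMS = {"2+2": "4", "3*3": "9"}                    # ground-truth answers

def generate(model, problem):
    """Sample an answer in proportion to the model's current weights."""
    answers = list(model[problem])
    weights = [model[problem][a] for a in answers]
    answer = random.choices(answers, weights=weights)[0]
    return f"reasoning for {problem} -> {answer}", answer

def verify(problem, answer):
    """Automated verifier: compare against the known answer."""
    return PROBLEMS[problem] == answer

def fine_tune(model, dataset):
    """'Fine-tune' by boosting the weight of every verified answer."""
    for problem, _, answer in dataset:
        model[problem][answer] += 1.0
    return model

# Start with the model indifferent between a correct and a wrong answer.
model = {p: {PROBLEMS[p]: 1.0, "wrong": 1.0} for p in PROBLEMS}
model, data = iterative_refinement(model, list(PROBLEMS),
                                   generate, verify, fine_tune)
```

Because only verified paths enter `dataset`, each retraining pass shifts the model toward correct answers while the weight of the wrong answer never grows, mirroring how the filtered data progressively improves the model.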

Updated 2026-05-06


Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences