Example of an Outcome-Based Reward Model in Mathematics

Evaluating mathematical calculations is a practical application of an outcome-based reward model. The model gives positive feedback when the final answer is correct and negative feedback when it is incorrect, and it does not assess the intermediate steps of the calculation. As a result, a solution that reaches the correct answer through flawed steps receives the same feedback as one that is correct throughout.
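The scoring rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real reward model: the function names `extract_final_answer` and `outcome_reward` are hypothetical, the final answer is assumed to be the last number in the solution text, and +1 and -1 stand in for positive and negative feedback.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(solution: str) -> Optional[Fraction]:
    """Return the last number in the solution text, or None if there is none.

    Assumption: the final answer is the last integer, decimal, or simple
    fraction (such as 3/4) that appears in the text.
    """
    matches = re.findall(r"-?\d+(?:/\d+|\.\d+)?", solution)
    if not matches:
        return None
    try:
        return Fraction(matches[-1])
    except (ValueError, ZeroDivisionError):
        # For example "3/0": treat an unparsable answer as missing.
        return None


def outcome_reward(solution: str, reference: str) -> int:
    """Score only the final answer: +1 if it matches the reference, else -1.

    The intermediate steps in `solution` are never inspected.
    """
    predicted = extract_final_answer(solution)
    if predicted is not None and predicted == Fraction(reference):
        return 1
    return -1


if __name__ == "__main__":
    sound = "17 + 25 = 17 + 20 + 5 = 42"
    wrong = "17 + 25 = 17 + 20 + 6 = 43"
    flawed_but_lucky = "17 + 25 = 40 + 1 = 42"  # 40 + 1 is not 42

    for text in (sound, wrong, flawed_but_lucky):
        print(outcome_reward(text, "42"), "<-", text)
```

The third solution contains an incorrect intermediate step but still ends in 42, so this sketch scores it the same as the fully correct one, which is the behavior the paragraph above describes.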

Updated 2025-10-10

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences