1Cademy - Outcome-Based Reward Models

Learn Before

Human Preference Alignment via Reward Models

Concept

Outcome-Based Reward Models

An outcome-based reward model evaluates an entire input-output sequence and assigns it a single reward based on the final result. This is effective for tasks where correctness can be verified by checking the final result alone, such as a problem with one known answer. The approach rewards the outcome, not the process used to reach it, so intermediate steps are never scored.
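To make the idea of scoring only the final result concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method of any particular system: the names `extract_final_answer` and `outcome_reward` are hypothetical, the "final answer" is assumed to be the last number in the response, and a simple rule-based check stands in for a learned reward model.

```python
import re
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Return the last number in the response, or None if there is none.

    Assumption: the final answer is the last number that appears in the text.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    return numbers[-1] if numbers else None


def outcome_reward(response: str, reference_answer: str) -> float:
    """Score the whole response with one reward based only on its final answer.

    Hypothetical rule-based stand-in for a learned outcome-based reward model:
    1.0 if the final answer matches the reference, 0.0 otherwise.
    Intermediate steps are never inspected.
    """
    final = extract_final_answer(response)
    if final is None:
        return 0.0
    return 1.0 if float(final) == float(reference_answer) else 0.0


# Correct steps, correct final answer.
sound = "12 * 4 = 48, then 48 + 2 = 50. The answer is 50."
# Incorrect steps that happen to reach the correct final answer.
flawed = "12 * 4 = 46, then 46 + 4 = 50. The answer is 50."
# Wrong final answer.
wrong = "12 * 4 = 48. The answer is 48."

for text in (sound, flawed, wrong):
    print(outcome_reward(text, "50"))
```

Under these assumptions the first two responses receive the same reward even though only the first reasons correctly, because nothing before the final answer is checked.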

Updated 2025-10-07

References

Reference of Foundations of Large Language Models Course

Learn After

Example of an Outcome-Based Reward Model in Mathematics
Insufficiency of Outcome-Based Rewards for Complex Reasoning
A company is training a language model to act as an automated assistant for processing loan applications. The model must follow a specific, legally-mandated, multi-step procedure to ensure fairness and compliance (e.g., checking credit history, verifying income, providing specific disclosures). The company decides to train the model using a system that provides a large positive reward only if the final loan decision (approve/deny) is correct based on the applicant's overall profile. What is the
Evaluating Reward Model Suitability
Reward Model Suitability for a Creative Task
