Learn Before
Comparing Outcome-Based and Process-Based Evaluations of Math Responses
When evaluating an AI's response to a math problem, an outcome-based approach treats any response with the correct final result as entirely correct, ignoring flaws in the intermediate reasoning. In contrast, a process-based approach assesses the correctness of each individual step, so it can identify and account for mistakes made during the reasoning even when the final answer is coincidentally correct. This step-level evaluation gives reward modeling a signal for each reasoning step rather than only the final answer, which is essential for guiding the model's logic.
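The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real reward model: the `Step` class, the function names, and the hand-assigned `is_correct` labels are all hypothetical, and exact string matching stands in for a proper answer checker. The example response simplifies 16/64 by "cancelling the 6s", which reaches the right answer through an invalid step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One reasoning step with a hypothetical, hand-assigned correctness label."""
    text: str
    is_correct: bool


def outcome_reward(final_answer: str, reference: str) -> float:
    """Outcome-based: full credit if the final answer matches, otherwise none."""
    return 1.0 if final_answer.strip() == reference.strip() else 0.0


def process_rewards(steps: List[Step]) -> List[float]:
    """Process-based: one score per step, so flawed steps stay visible."""
    return [1.0 if step.is_correct else 0.0 for step in steps]


def first_error_index(steps: List[Step]) -> Optional[int]:
    """Return the position of the first incorrect step, or None if all are correct."""
    for index, step in enumerate(steps):
        if not step.is_correct:
            return index
    return None


# A response that reaches the right answer through an invalid step.
steps = [
    Step("Write the problem as the fraction 16/64.", True),
    Step("Cancel the digit 6 in the numerator and denominator to get 1/4.", False),
]
final_answer = "1/4"
reference_answer = "1/4"

outcome = outcome_reward(final_answer, reference_answer)
per_step = process_rewards(steps)
first_error = first_error_index(steps)

print(outcome)      # 1.0: the outcome-based view sees nothing wrong
print(per_step)     # [1.0, 0.0]: the process-based view flags the second step
print(first_error)  # 1
```

In practice the step labels would come from human annotators or a trained verifier rather than being hard-coded; the point here is only that the two approaches assign different credit to the same response.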
Tags
Foundations of Large Language Models
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Diagnosing Flawed Reasoning in Language Models
A team is training a language model to act as a programming assistant that generates code. They observe that the model sometimes produces functionally correct code (the outcome is right) but uses inefficient, non-standard, or difficult-to-maintain methods (the process is poor). Which of the following feedback strategies would be most effective at specifically improving the quality of the reasoning process, rather than just the correctness of the final output?
A research team is developing a large language model for different tasks. Match each training objective with the most appropriate feedback strategy.