1Cademy - An AI model generates a step-by-step solution to a complex math problem. The model's final answer is correct. However, upon review, it is discovered that an intermediate calculation step contains a logical error, but a subsequent error coincidentally corrected the mistake, leading to the right final number. If an evaluator's goal is to reward sound reasoning by assessing the validity of each individual step, how would this solution be scored?

Learn Before

Process-Based Verification

Multiple Choice

An AI model generates a step-by-step solution to a complex math problem. The model's final answer is correct. However, upon review, it is discovered that an intermediate calculation step contains a logical error, but a subsequent error coincidentally corrected the mistake, leading to the right final number. If an evaluator's goal is to reward sound reasoning by assessing the validity of each individual step, how would this solution be scored?
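
The contrast the question draws can be made concrete with a small sketch. Everything below is a hypothetical illustration, not a method taken from this page: the `Step` record, the toy problem (3 + 5) * 2 - 4 = 12, and the "all steps must be valid" aggregation rule are assumptions chosen for clarity. Each step is checked on its own stated inputs, so a step that correctly uses a wrong earlier value still counts as locally valid.

```python
import operator
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One line of a model's worked solution (hypothetical structure)."""
    text: str                             # what the model wrote
    op: Callable[[float, float], float]   # operation the step claims to perform
    inputs: Tuple[float, float]           # operands the step used
    stated: float                         # result the step wrote down


def step_labels(steps: List[Step]) -> List[bool]:
    """Process view: recompute every step and compare with its stated result."""
    return [s.op(*s.inputs) == s.stated for s in steps]


def outcome_score(steps: List[Step], expected: float) -> float:
    """Outcome view: look only at the final stated number."""
    return 1.0 if steps[-1].stated == expected else 0.0


def process_score(steps: List[Step]) -> float:
    """Assumed aggregation: full credit only if every step is valid."""
    return 1.0 if all(step_labels(steps)) else 0.0


# Toy problem: (3 + 5) * 2 - 4, whose correct answer is 12.
steps = [
    Step("3 + 5 = 9", operator.add, (3, 5), 9),       # first error
    Step("9 * 2 = 18", operator.mul, (9, 2), 18),     # valid given its inputs
    Step("18 - 4 = 12", operator.sub, (18, 4), 12),   # second error, cancels the first
]

print(outcome_score(steps, expected=12))  # 1.0
print(step_labels(steps))                 # [False, True, False]
print(process_score(steps))               # 0.0
```

Under these assumptions, an evaluator that checks only the final answer gives full credit, while a step-level evaluator flags both faulty steps and withholds credit even though the final number is right. Other aggregation rules (for example, the fraction of valid steps) would give partial rather than zero credit.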

Updated 2025-10-01
