Short Answer

Nuanced Evaluation of Reasoning Paths

Imagine a system that evaluates multi-step reasoning by summing the log-probabilities of each step being 'correct'. Consider two reasoning paths, Path X and Path Y, that both solve a problem and have the same number of steps. In Path X, every step is evaluated with a moderately high probability of being correct (e.g., 80% for each step). In Path Y, most steps are evaluated with near-certainty of being correct (e.g., 99%), but one step is evaluated with a very low probability of being correct (e.g., 10%). Which path is likely to receive a higher total score from this evaluation system, and why? Explain your reasoning based on the properties of logarithms.
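
One way to explore the question numerically is the minimal sketch below. It assumes natural logarithms and the example probabilities from the prompt (80%, 99%, 10%). The function names path_score and compare_paths, and the step counts tried in the loop, are illustrative assumptions and not part of the original exercise.

```python
import math


def path_score(step_probs):
    """Sum of log-probabilities over all steps (natural log assumed)."""
    return sum(math.log(p) for p in step_probs)


def compare_paths(n_steps, p_x=0.80, p_y_high=0.99, p_y_low=0.10):
    """Return (score_x, score_y) for two hypothetical paths of equal length.

    Path X: every step has probability p_x.
    Path Y: all steps have probability p_y_high except one at p_y_low.
    """
    path_x = [p_x] * n_steps
    path_y = [p_y_high] * (n_steps - 1) + [p_y_low]
    return path_score(path_x), path_score(path_y)


if __name__ == "__main__":
    # Step counts chosen only for illustration.
    for n in (5, 10, 11, 20):
        score_x, score_y = compare_paths(n)
        winner = "X" if score_x > score_y else "Y"
        print(f"n={n:2d}  X={score_x:.3f}  Y={score_y:.3f}  higher: {winner}")
```

Under these assumed numbers, the single 10% step costs ln(0.1) ≈ -2.30, roughly as much as ten 80% steps (each ln(0.8) ≈ -0.22). Path X therefore scores higher for short paths, while Path Y overtakes it only once the paths are long enough for the accumulated 80% penalties to exceed that one large penalty.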

Updated 2025-10-10

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science