Learn Before
Consider two methods for scoring a multi-step reasoning process generated by an AI. Both methods use an underlying model that, for each step, outputs a probability that the step is 'correct'.
- Method A: Assigns a score of +1 to each step where the probability of being 'correct' is greater than 0.5. The total score is the sum of these step scores.
- Method B: Calculates the total score by summing the logarithm of the 'correct' probability for every step in the process.
Now, analyze two reasoning paths for the same problem:
- Path 1: Consists of 3 steps, each with a 'correct' probability of 0.9.
- Path 2: Consists of 3 steps, each with a 'correct' probability of 0.6.
Which statement accurately compares how these two methods would score the paths?
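The arithmetic behind the comparison can be checked with a short sketch. It is only an illustration under stated assumptions: the function names `threshold_score` and `log_prob_score` are hypothetical, Method A is assumed to give 0 to any step at or below the 0.5 threshold, and Method B is assumed to use the natural logarithm (a different base rescales the totals but does not change which path ranks higher).

```python
import math

def threshold_score(probs, threshold=0.5):
    # Method A: +1 for each step whose 'correct' probability exceeds the threshold.
    # Assumption: steps at or below the threshold contribute 0.
    return sum(1 for p in probs if p > threshold)

def log_prob_score(probs):
    # Method B: sum of the log of each step's 'correct' probability.
    # Assumption: natural logarithm.
    return sum(math.log(p) for p in probs)

path_1 = [0.9, 0.9, 0.9]  # Path 1: three steps at 0.9
path_2 = [0.6, 0.6, 0.6]  # Path 2: three steps at 0.6

print(threshold_score(path_1), threshold_score(path_2))  # 3 3
print(f"{log_prob_score(path_1):.3f}")  # -0.316
print(f"{log_prob_score(path_2):.3f}")  # -1.532
```

Under these assumptions, Method A cannot tell the two paths apart (both score 3), while Method B scores Path 1 higher (about -0.316 versus -1.532), because it keeps the confidence information that thresholding discards.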
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Formula for Log-Probability-Based Reward for Reasoning Paths
Evaluating AI Reasoning Strategies
Nuanced Evaluation of Reasoning Paths