Formula for Scoring Reasoning Paths by Counting Correct Steps
A simple way to score a reasoning path is to count the number of steps classified as 'correct'. This can be written as:
$$r(x, y) = \sum_{k=1}^{n} \delta(c_k, \text{correct})$$
where:
- $r(x, y)$ is the total reward for the reasoning path $y$ given the input $x$.
- $n$ is the total number of steps in the reasoning path.
- $c_k$ is the classification output for step $k$, determined by selecting the label with the maximum probability.
- $\delta(a, b)$ is the Kronecker delta function, which is $1$ if $a = b$ and $0$ otherwise. In this context, it equals $1$ if the classification for step $k$ is 'correct'.
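
The counting rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `classify_step` and `score_reasoning_path`, the label strings, and the example probabilities are all hypothetical and are not taken from the source. It assumes a step classifier has already produced a probability for each label at every step.

```python
from typing import Dict, List


def classify_step(label_probs: Dict[str, float]) -> str:
    """Pick the label with the maximum probability for one step."""
    return max(label_probs, key=label_probs.get)


def score_reasoning_path(step_probs: List[Dict[str, float]]) -> int:
    """Count the steps whose predicted label is 'correct'.

    Each step contributes 1 if its label equals 'correct' and 0 otherwise,
    which is the role the Kronecker delta plays in the description above.
    """
    return sum(1 for probs in step_probs if classify_step(probs) == "correct")


# Hypothetical classifier outputs for a three-step reasoning path.
path = [
    {"correct": 0.9, "incorrect": 0.1},
    {"correct": 0.4, "incorrect": 0.6},
    {"correct": 0.7, "incorrect": 0.3},
]
score = score_reasoning_path(path)
print(score)
```

Because the score is a raw count, it depends on the number of steps as well as on their labels, so scores of paths with different lengths are not directly comparable without some form of normalization.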

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Scoring an AI's Reasoning Process
An AI's multi-step solution to a complex problem is evaluated by a separate model that classifies each step as either 'correct' or 'incorrect'. The final quality score for the entire solution is calculated by summing the total number of steps classified as 'correct'. What is a primary conceptual limitation of this evaluation approach?
Calculating a Reasoning Path Score
LLM Prediction with Full Context
LLM Prediction with Compressed Context
Mathematical Formulation of Prompt Ensembling
A classification model is given an input, x, and must choose an output, y, from the set of possible classes {A, B, C, D}. The model's decision rule is to select the class that has the highest conditional probability, Pr(y|x). Given the following probabilities calculated by the model for the input x, what will its final prediction be? Pr(y=A | x) = 0.15, Pr(y=B | x) = 0.55, Pr(y=C | x) = 0.25, Pr(y=D | x) = 0.05
Model Prediction vs. Ground Truth
Analyzing a Model's Prediction Choice
Learn After
Calculating Reasoning Path Score
Consider two reasoning paths, Path A and Path B, generated to solve the same problem. Path A consists of 3 steps, all of which are classified as 'correct'. Path B consists of 5 steps, where 4 are classified as 'correct' and 1 is 'incorrect'. According to the scoring formula that counts the steps classified as 'correct', which of the following statements is true?
Critique of a Reasoning Path Scoring Method