Case Study

Evaluating AI Reasoning Strategies

You are evaluating two reasoning paths that an AI generated for the same problem. A scoring system computes each path's total reward by summing, over all of its steps, the log-probability that an automated checker assigns to the step being 'correct'. Based on the step-level probabilities provided in the case study, which path would this scoring system prefer? Justify your answer by explaining how this scoring method treats paths whose certainty varies from step to step.
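The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name path_reward and both probability lists are hypothetical and are not the values from the case study. Because a sum of logarithms equals the logarithm of the product, a single low-probability step lowers the total far more than several moderately confident steps do.

```python
import math

def path_reward(step_probs):
    """Total reward: sum of the log-probabilities that each step is correct."""
    return sum(math.log(p) for p in step_probs)

# Hypothetical step-level probabilities (NOT taken from the case study).
steady_path = [0.8, 0.8, 0.8]    # moderately confident at every step
uneven_path = [0.99, 0.99, 0.3]  # very confident twice, one weak step

steady_reward = path_reward(steady_path)
uneven_reward = path_reward(uneven_path)

# The path with the higher (less negative) total is preferred.
preferred = "steady" if steady_reward > uneven_reward else "uneven"
print(f"steady: {steady_reward:.4f}, uneven: {uneven_reward:.4f}, preferred: {preferred}")
```

With these illustrative numbers the steady path scores higher, since log(0.3) alone is more negative than three copies of log(0.8) combined.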

Updated 2025-10-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science