Learn Before
Evaluating an AI-Generated Travel Plan
You are a quality evaluator for an AI assistant. Your task is to assess the AI's reasoning by examining the validity of each individual step, rather than just the overall feasibility of the final plan. Based on this step-by-step evaluation method, analyze the provided AI response. Identify the specific step where the reasoning fails and explain why the subsequent steps are also flawed as a result.
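The step-by-step evaluation method described above can be sketched in code. This is a minimal illustration, not a prescribed scoring rule: the `Step` type, the `valid` flag (standing in for a human or verifier verdict), the scoring convention, and the example plan are all hypothetical. It assumes that every step after the first invalid one builds on it and therefore earns no credit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    text: str
    valid: bool  # hypothetical verdict from a human evaluator or verifier model


def first_failure(steps: List[Step]) -> Optional[int]:
    """Return the index of the first invalid step, or None if all steps are valid."""
    for i, step in enumerate(steps):
        if not step.valid:
            return i
    return None


def process_score(steps: List[Step]) -> float:
    """Fraction of steps that are sound before the first failure.

    Assumption: steps after the first failure rest on a flawed premise,
    so they are not credited even if they look reasonable in isolation.
    """
    if not steps:
        return 0.0
    failure = first_failure(steps)
    sound = len(steps) if failure is None else failure
    return sound / len(steps)


def outcome_score(final_result_ok: bool) -> float:
    """Outcome-only scoring: looks at the end result and ignores the steps."""
    return 1.0 if final_result_ok else 0.0


# Hypothetical travel plan, invented purely for illustration.
plan = [
    Step("Flight lands at 14:00.", True),
    Step("Airport-to-hotel transfer takes 10 minutes.", False),  # unrealistic estimate
    Step("Check in at the hotel by 14:15.", True),   # depends on the flawed transfer time
    Step("Start the museum tour at 14:30.", True),   # depends on the check-in time
]

failed_at = first_failure(plan)
step_level = process_score(plan)
```

Here the last two steps look plausible on their own, but both inherit the faulty transfer estimate, so the step-level score credits only the first step, whereas an outcome-only check could still rate the plan as feasible.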
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Step-Level Search with Verifiers
An AI model generates a step-by-step solution to a complex math problem, and its final answer is correct. On review, however, an intermediate calculation step is found to contain a logical error that a later error coincidentally cancels out, producing the right final number. If an evaluator's goal is to reward sound reasoning by assessing the validity of each individual step, how would this solution be scored?
Evaluating AI Reasoning for Tutoring
Evaluating an AI-Generated Travel Plan