Analyzing Prompt Design for LLM Evaluation
Analyze why the prompt 'Please calculate the average of the numbers 2, 4, and 9' is considered a test of a Large Language Model's ability to perform a direct mathematical operation rather than its multi-step reasoning skills. In your analysis, write a contrasting prompt designed to test the model's step-by-step reasoning on the same calculation, and explain how it differs from the original prompt.
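As a minimal sketch of the contrast the exercise asks for, the snippet below places the direct prompt next to one possible step-by-step prompt and computes a reference answer to check model outputs against. The wording of `stepwise_prompt` and all variable names are illustrative assumptions, not part of the course material, and no model is called.

```python
# Numbers taken from the exercise.
numbers = [2, 4, 9]

# Direct prompt, quoted from the exercise: it asks only for the result.
direct_prompt = "Please calculate the average of the numbers 2, 4, and 9"

# Hypothetical contrasting prompt (an assumption, one of many possible wordings):
# it asks the model to expose each intermediate step before answering.
stepwise_prompt = (
    "Calculate the average of the numbers 2, 4, and 9. "
    "First add the numbers and state the sum. "
    "Then state how many numbers there are. "
    "Then divide the sum by the count and state the final answer."
)

# Reference values for the intermediate steps and the final answer.
total = sum(numbers)       # 2 + 4 + 9 = 15
count = len(numbers)       # 3
reference = total / count  # 15 / 3 = 5.0

print(f"sum={total}, count={count}, average={reference}")
```

With the direct prompt, only the final number can be compared against `reference`. With the step-by-step prompt, the stated sum and count can also be compared against `total` and `count`, which shows where a wrong answer went wrong.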
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Step-by-Step Calculation of the Average of 2, 4, and 9
LLM's Answer (7) to the Prompt for Calculating the Average of 2, 4, and 9
A researcher wants to test a language model's ability to perform a standard mathematical operation directly, without guiding it through intermediate reasoning steps. Which of the following prompts is best designed to achieve this specific goal for the numbers 2, 4, and 9?
Evaluating a Prompt for Foundational Skill Assessment