Learn Before
A development team creates a prompt for a text summarization task and meticulously refines it for 'Language Model Alpha', achieving a 95% performance score. When they deploy the exact same prompt on 'Language Model Beta', a different model of comparable overall capability, the performance score drops to 70%. What is the most likely reason for this discrepancy?
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating a Prompt Development Strategy
Model-Dependent Prompt Performance