Diagnosing Few-Shot Learning Failures
A developer is using a language model to classify customer feedback. When they test it on the new feedback 'This is the best app I have ever used!', the model incorrectly classifies it as 'Negative'. Based on the provided case study, explain the most likely reason for this specific failure, and identify the single change to the set of examples that would most effectively fix it.
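A minimal sketch of the failure mode the question points at, assuming (hypothetically, since the case study itself is not reproduced here) that the developer's few-shot examples all carry the 'Negative' label, which skews the model's label prior. The `build_prompt` helper and the example feedback strings are illustrative, not from the source:

```python
def build_prompt(examples, query):
    """Assemble a few-shot classification prompt from (text, label) pairs."""
    lines = ["Classify the following customer feedback."]
    for text, label in examples:
        lines.append(f"Feedback: {text} Classification: {label}")
    lines.append(f"Feedback: {query} Classification:")
    return "\n".join(lines)

# Hypothetical skewed example set: every demonstration is 'Negative',
# so the model never sees what a 'Positive' case looks like.
skewed = [
    ("The app crashes constantly.", "Negative"),
    ("Support never replied to my ticket.", "Negative"),
]

# The single most effective fix: add one clearly 'Positive' demonstration
# so the prompt covers the label the test input should receive.
balanced = skewed + [("I love how fast and simple this app is.", "Positive")]

query = "This is the best app I have ever used!"
print(build_prompt(balanced, query))
```

The point of the fix is coverage: the one change that matters most is adding a demonstration for the missing label, not adding more examples of labels already shown.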
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A developer is trying to get a language model to classify short movie reviews as 'Positive', 'Negative', or 'Neutral'. They test two different sets of instructions, shown below.
Instructions A:
Classify the following movie review.
Review: The plot was predictable and the acting was wooden. Classification: Negative
Review: This film was an absolute masterpiece from start to finish. Classification:

Instructions B:
Classify the following movie reviews.
Review: The plot was predictable and the acting was wooden. Classification: Negative
Review: It wasn't a bad movie, but it wasn't particularly memorable either. Classification: Neutral
Review: This film was an absolute masterpiece from start to finish. Classification: Positive
Review: I have seen better, but it was an enjoyable way to spend an afternoon. Classification:
Why are 'Instructions B' significantly more likely than 'Instructions A' to produce a correct and reliable classification of the final review?
Evaluating Demonstration Sufficiency in a Prompt