Evaluating a Prompt Engineering Strategy
Two teams are developing applications using language models. Team A is using a very large, state-of-the-art model and is getting good results with simple, direct instructions. Team B is using a smaller, open-source model and is struggling to get consistent outputs. Team B's lead argues, "The problem isn't our model; it's that we haven't found the 'perfect' universal prompt yet. A truly well-crafted prompt should work on any model." Based on the relationship between a prompt's design and a model's underlying abilities, evaluate the soundness of Team B's argument and justify your conclusion.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Comparison of Prompting Strong vs. Weak LLMs
A developer designs a prompt for a task and finds it works exceptionally well with a large, state-of-the-art language model. However, when the same prompt is used with a smaller, less powerful model, the results are significantly worse. To achieve a similar quality of output from the smaller model, the prompt needs to be made much more detailed and explicit. What fundamental principle about interacting with language models does this situation demonstrate?
LLM Selection and Prompt Strategy