Learn Before
Evaluating Prompting Strategies for Scalable Inference
A development team is building a service to summarize millions of customer support tickets daily. Engineer A advocates using a detailed, 500-word text prompt that supplies extensive instructions and examples to the language model for every ticket. Engineer B proposes investing upfront compute to train a low-dimensional, optimized prompt representation (a soft prompt) that is not human-readable but is tailored to this specific summarization task. Evaluate the two approaches strictly from the perspective of long-term computational efficiency and resource consumption once the service is deployed at massive scale. Which approach is more suitable, and why?
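One way to reason about the trade-off is a back-of-the-envelope cost model. The sketch below is illustrative only: the token counts (a 500-word prompt tokenizing to roughly 650 tokens, a soft prompt using about 20 learned vectors, tickets averaging 300 tokens) are assumptions, and the quadratic term stands in for self-attention cost per request.

```python
# Illustrative cost model (assumed numbers, not from the source):
# self-attention cost per request grows roughly quadratically with
# the total input length, so a shorter prompt prefix pays off on
# every one of the millions of daily requests.

def relative_attention_cost(prompt_len: int, ticket_len: int = 300) -> int:
    """Relative O(n^2) self-attention cost for one summarization call."""
    n = prompt_len + ticket_len
    return n * n

# Engineer A: ~500-word instruction prompt, assumed ~650 tokens.
hard = relative_attention_cost(650)
# Engineer B: trained soft prompt, assumed ~20 virtual-token vectors.
soft = relative_attention_cost(20)

print(f"hard-prompt relative cost: {hard}")
print(f"soft-prompt relative cost: {soft}")
print(f"per-ticket savings factor: {hard / soft:.1f}x")
```

Under these assumptions the soft prompt is cheaper by nearly an order of magnitude per ticket, so its one-time training cost is amortized quickly at millions of tickets per day.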
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Practical Application of Soft Prompts in Repetitive Tasks
A software company is developing a feature that classifies millions of user-generated comments per day into one of ten categories using a large language model. The primary constraints are minimizing operational cost and ensuring high throughput (fast processing of each comment). Which prompting strategy should the development team choose to best meet these requirements?
Evaluating Prompting Strategies for Scalable Inference
Explaining Computational Performance in Prompting