Case Study

Evaluating Prompting Strategies for Scalable Inference

A development team is building a service to summarize millions of customer support tickets daily. Engineer A advocates using a detailed, 500-word text prompt (a hard prompt) that supplies extensive instructions and examples to the language model for every ticket. Engineer B proposes spending compute up front to train a low-dimensional, optimized prompt representation (a soft prompt) that is not human-readable but is tailored to this specific summarization task. Evaluate the two approaches strictly from the perspective of long-term computational efficiency and resource consumption once the service is deployed at massive scale. Which approach is more suitable, and why?
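The efficiency trade-off at the heart of the question can be made concrete with a back-of-envelope sketch. All numbers below are illustrative assumptions, not figures from the case study: ticket volume is set to one million per day, the 500-word hard prompt is approximated as 650 tokens, and the soft prompt as 20 learned virtual tokens (a length commonly seen in prompt-tuning work).

```python
# Back-of-envelope comparison of per-request prompt overhead at scale.
# Every constant here is an assumption chosen for illustration.

TICKETS_PER_DAY = 1_000_000   # assumed daily ticket volume
HARD_PROMPT_TOKENS = 650      # rough tokenization of a 500-word prompt
SOFT_PROMPT_TOKENS = 20       # assumed number of learned virtual tokens


def daily_prompt_tokens(per_request_tokens: int,
                        tickets: int = TICKETS_PER_DAY) -> int:
    """Extra tokens the model must process per day for the prompt alone,
    before any ticket text or generated summary is counted."""
    return per_request_tokens * tickets


hard = daily_prompt_tokens(HARD_PROMPT_TOKENS)   # 650,000,000 tokens/day
soft = daily_prompt_tokens(SOFT_PROMPT_TOKENS)   # 20,000,000 tokens/day

print(f"hard prompt overhead: {hard:,} tokens/day")
print(f"soft prompt overhead: {soft:,} tokens/day")
print(f"reduction factor: {hard / soft:.1f}x")
```

Under these assumed numbers the soft prompt cuts per-request prompt overhead by roughly 30x, which is the kind of recurring inference saving the question asks you to weigh against the one-time training cost of learning the prompt representation.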

Updated 2025-10-02

Tags

Ch.3 Prompting - Foundations of Large Language Models

Computing Sciences

Evaluation in Bloom's Taxonomy
