Learn Before
Evaluating Trade-offs in LLM Deployment
A financial services company is developing two applications: a real-time customer service chatbot and an overnight batch-processing system for summarizing daily market reports. They have two models to choose from:
- Model A: State-of-the-art accuracy, but consumes a large amount of electrical power per inference.
- Model B: Slightly lower accuracy than Model A, but significantly more energy-efficient.
Evaluate which model would be more suitable for each application. Justify your recommendations by explaining the trade-offs between model performance and energy consumption in the context of each specific use case.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A company is deploying a text-generation model and needs to choose the most energy-efficient hardware configuration. Their goal is to maximize the number of text generations for every unit of energy consumed. They test two options:
- Configuration X: Generates 200 text completions per minute and consumes 500 watts of power.
- Configuration Y: Generates 150 text completions per minute and consumes 250 watts of power.
Based on the stated goal, which configuration is the better choice and why?
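The stated goal, maximizing text generations per unit of energy, can be checked by normalizing each configuration's throughput by its power draw. A minimal sketch; the `completions_per_watt_minute` helper and the watt-minute unit are illustrative choices, not part of the question:

```python
# Compare energy efficiency of two hypothetical hardware configurations.
# Throughput and power figures come from the question above; the metric
# (completions per watt-minute) is one reasonable way to express
# "generations per unit of energy consumed".

def completions_per_watt_minute(completions_per_minute: float, watts: float) -> float:
    """Energy efficiency: text completions produced per watt-minute consumed."""
    return completions_per_minute / watts

config_x = completions_per_watt_minute(200, 500)  # 200 completions/min at 500 W -> 0.4
config_y = completions_per_watt_minute(150, 250)  # 150 completions/min at 250 W -> 0.6

print(f"Configuration X: {config_x:.2f} completions per watt-minute")
print(f"Configuration Y: {config_y:.2f} completions per watt-minute")
```

Under this metric, the configuration with the higher ratio delivers more output for the same energy budget, which is exactly what the stated goal asks for.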
Diagnosing High Energy Costs in LLM Deployment