Learn Before
A development team is tasked with deploying a large language model on a fleet of smartphones, which have strict memory limitations. To achieve this, they apply a technique that reduces the numerical precision of the model's parameters, thereby decreasing its overall size. What is the most likely and direct trade-off the team must evaluate when implementing this change?
0
1
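The trade-off the question points at can be made concrete with a minimal sketch of symmetric 8-bit quantization. This is an illustrative toy (the function names and the four example weights are invented for this sketch, not taken from the course): storage drops 4x versus float32, while each parameter picks up a bounded rounding error, i.e., a potential loss of model accuracy.

```python
# Toy symmetric int8 quantization: 4x smaller than float32,
# at the cost of a small per-parameter rounding error.

def quantize_int8(weights):
    """Map float weights to int8 codes plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]  # each fits in [-127, 127]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.98, -0.27]       # hypothetical parameters
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
# Memory: 1 byte per int8 vs 4 bytes per float32 -> 4x reduction.
# Accuracy: round-to-nearest bounds the error by scale / 2.
```

With round-to-nearest, the reconstruction error of every weight is at most half the scale, which is the precision given up in exchange for the smaller memory footprint.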
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Architectural Modification for Long Sequence Processing
Model Compression for LLM Inference
LLM Deployment Strategy for Mobile Devices
An engineering team observes that their large language model's memory consumption is acceptable for short user inputs, but it grows excessively and becomes unmanageable as the length of the input text increases. Which of the following statements best diagnoses the underlying issue that a memory reduction technique would need to address in this specific scenario?
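The scenario in this last question can be sketched with a back-of-envelope model of the KV cache, whose size grows linearly with input length. The model dimensions below (layers, heads, head size, fp16 values) are illustrative assumptions, not figures from the question:

```python
# Toy estimate of KV-cache memory: keys and values are cached for
# every token, at every layer and head, so memory grows linearly
# with sequence length. Dimensions here are assumed for illustration.

def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128,
                   bytes_per_value=2):
    """Bytes cached for seq_len tokens (keys + values, all layers)."""
    per_token = 2 * n_layers * n_heads * head_dim * bytes_per_value
    return seq_len * per_token

short_input = kv_cache_bytes(128)     # a short prompt
long_input = kv_cache_bytes(32_768)   # a long document
# A 256x longer input needs 256x more cache memory, which is why
# consumption is acceptable for short inputs but grows unmanageably.
```

This linear growth, independent of the fixed parameter memory, is the underlying issue a cache-oriented memory reduction technique would target.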