Memory-Compute Trade-off in Constrained Environments
An engineer is deploying a large language model on a device with a powerful processor but very limited memory. To handle long text inputs without running out of memory, they decide to implement a strategy that involves re-calculating some intermediate values during processing instead of storing them all. Explain the fundamental trade-off this engineer is making and why it is a suitable choice for this specific hardware constraint.
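One way to make the trade-off concrete is a toy sketch (all names and values here are illustrative assumptions, not the engineer's actual system): storing every intermediate activation uses memory proportional to the number of layers, while recomputing an activation on demand keeps stored state constant at the cost of extra arithmetic.

```python
# Illustrative sketch of the memory-compute trade-off: instead of caching
# every intermediate activation, rerun part of the forward pass on demand.

def layer(x, w):
    """A toy 'layer': one multiply standing in for a transformer block."""
    return x * w

WEIGHTS = [2, 3, 5]  # hypothetical per-layer parameters

def forward_store_all(x):
    """Memory-heavy: keep every intermediate activation (O(n_layers) memory)."""
    acts = [x]
    for w in WEIGHTS:
        acts.append(layer(acts[-1], w))
    return acts

def forward_recompute(x, upto):
    """Memory-light: regenerate activation `upto` by rerunning the forward
    pass from the input. Extra FLOPs, but only O(1) activations held."""
    h = x
    for w in WEIGHTS[:upto]:
        h = layer(h, w)
    return h

stored = forward_store_all(4)
# Recomputation reproduces every stored activation exactly.
for i in range(len(WEIGHTS) + 1):
    assert forward_recompute(4, i) == stored[i]
```

Because the device's processor is fast but its memory is scarce, paying the extra recomputation cost (the second function's repeated multiplies) is cheaper than holding the full activation list in memory, which is the essence of the suitable choice the question asks about.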
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Chunked and Windowed Attention
An engineer is deploying a large language model for a task that requires processing very long sequences of text. During testing, they observe that the system's memory usage grows linearly with the length of the input sequence, eventually causing the system to run out of memory and fail. Which of the following strategies correctly exploits the underlying trade-off to mitigate this specific memory issue?
Optimizing a Document Summarization Service