Diagnosing LLM Performance Bottlenecks
Based on the provided scenario, what is the most likely underlying cause of the 'out-of-memory' errors, and what general strategy should the research lab investigate to resolve this issue without sacrificing the model's ability to consider the entire novel at once?
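The quadratic memory cost at the heart of this question can be made concrete with a short, illustrative sketch (not tied to any specific model): standard self-attention materializes an L×L score matrix per head for sequence length L, so doubling the context quadruples that matrix's memory. The head count and fp16 element size below are assumptions for illustration.

```python
def attn_score_matrix_bytes(seq_len: int, n_heads: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to materialize the full attention score matrix for one layer.

    Each head stores a (seq_len x seq_len) matrix of scores; fp16 assumed.
    """
    return n_heads * seq_len * seq_len * bytes_per_elem

# Memory grows quadratically with context length (32 heads assumed):
for L in (1_000, 10_000, 100_000):
    gib = attn_score_matrix_bytes(L, n_heads=32) / 2**30
    print(f"L={L:>7}: {gib:,.2f} GiB per layer")
```

Running this shows why a novel-length context exhausts GPU memory: a 10x longer input needs 100x the score-matrix memory per layer, which is why techniques that avoid materializing the full matrix (e.g. blockwise or memory-efficient attention) are the usual remedy.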
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Related
LLM Architecture Selection for a Legal Tech Application
A development team is building a language model based on the standard Transformer architecture to summarize lengthy legal documents, often exceeding 10,000 tokens. They observe that the model's memory usage grows quadratically with the input length, leading to out-of-memory errors. Which of the following architectural modifications most directly targets the root cause of this specific memory issue?
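The modification the related question points toward can be sketched in the same spirit: sparse-attention variants (e.g. a sliding-window pattern) replace the full L×L score matrix with one of size L×w for a fixed window w, making memory linear in the input length. The window size and head count here are hypothetical values for illustration only.

```python
def sliding_window_score_bytes(seq_len: int, window: int, n_heads: int,
                               bytes_per_elem: int = 2) -> int:
    """Bytes for attention scores when each token attends to at most `window` tokens.

    Memory is seq_len * window, i.e. linear in seq_len for fixed window.
    """
    return n_heads * seq_len * window * bytes_per_elem

def full_score_bytes(seq_len: int, n_heads: int, bytes_per_elem: int = 2) -> int:
    """Bytes for the dense (seq_len x seq_len) score matrix, for comparison."""
    return n_heads * seq_len * seq_len * bytes_per_elem

# For a 10,000-token legal document, a 512-token window (assumed) shrinks
# score memory by roughly seq_len / window:
L, w = 10_000, 512
print(f"dense:  {full_score_bytes(L, 32) / 2**30:.2f} GiB per layer")
print(f"window: {sliding_window_score_bytes(L, w, 32) / 2**30:.2f} GiB per layer")
```

This is why the sparse variant directly targets the root cause named in the question: it changes the O(L²) term itself rather than merely shrinking batch size or precision.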