Learn Before
Diagnosing Inference Latency
Based on the following observation, what specific operation at the beginning of the text generation process is likely responsible for the initial, input-length-dependent delay, and what is its fundamental purpose?
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Prefilling Phase in Transformer Inference
A user provides the following sequence of words to a large language model: 'Write a short story about a robot who discovers music.' In the model's text generation process, what is the primary role of this initial sequence of words?
Diagnosing Inference Latency
The Role of the Initial Input Sequence