Learn Before
A machine learning engineer observes that the initial processing of a user's prompt in a large language model takes a significant amount of time, but subsequent token generation is much faster per token. Based on this observation, which statement best analyzes the primary function of this initial processing phase (prefilling)?
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Comparison of Prefilling and BERT Encoding
Objectives of Inference Phases
The main goal of the prefilling phase in a generative language model is to compute the contextual representation of the entire input sequence in parallel, populating the key-value (KV) cache that later decoding steps reuse; emitting the first token of the model's response is the byproduct that concludes this phase. This is why the initial processing scales with prompt length while each subsequent token is generated quickly.
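The contrast between the two phases can be illustrated with a minimal sketch. All names here (`toy_attention`, `prefill`, `decode_step`) are illustrative, not a real inference API: prefilling processes every prompt token and fills a KV cache, while each decode step processes only one new token and reuses the cache.

```python
# Toy sketch of prefilling vs. per-token decoding (illustrative names only).

def toy_attention(query, keys, values):
    # Simplified attention: weight each value by the query-key dot product.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    total = sum(scores) or 1.0
    dim = len(values[0])
    return [sum((s / total) * v[i] for s, v in zip(scores, values))
            for i in range(dim)]

def prefill(prompt_embeddings):
    # Prefilling: process ALL prompt tokens at once, building the KV cache.
    # Cost grows with prompt length, which is why this phase feels slow.
    kv_cache = {"keys": list(prompt_embeddings),
                "values": list(prompt_embeddings)}
    # The first output token is produced from the last prompt position.
    first_out = toy_attention(prompt_embeddings[-1],
                              kv_cache["keys"], kv_cache["values"])
    return kv_cache, first_out

def decode_step(kv_cache, new_embedding):
    # Decoding: only ONE new token is processed per step; cached keys and
    # values for all earlier tokens are reused, so each step is cheap.
    kv_cache["keys"].append(new_embedding)
    kv_cache["values"].append(new_embedding)
    return toy_attention(new_embedding, kv_cache["keys"], kv_cache["values"])

prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token embeddings
cache, first = prefill(prompt)
nxt = decode_step(cache, first)
print(len(cache["keys"]))  # 4: three prompt tokens plus one generated token
```

The sketch mirrors the observation in the question: `prefill` touches every prompt position (work proportional to prompt length), whereas each `decode_step` attends over the cache but embeds only a single new token.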