Learn Before
Rationale for Two-Stage Inference Computation
In a system that generates text based on user input, the computational process is often divided into two main stages: one for processing the initial input all at once, and another for generating each subsequent piece of the output sequentially. Explain the fundamental reason for this separation and describe the key computational difference between these two stages.
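The two stages in this question can be pictured with a toy sketch. It is only an illustration under simplifying assumptions: a single attention head, identity projections (each token vector serves as its own query, key, and value), and hypothetical function names (`attend`, `prefill`, `decode_step`) that do not come from any particular library. The first stage sees the whole input at once, so every position can be computed independently and cached. The second stage handles one new position at a time and reuses the cache rather than recomputing earlier positions.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[d] for w, v in zip(weights, values)) / total
        for d in range(dim)
    ]


def prefill(prompt_vectors):
    """Stage 1: process the whole input and build the key/value cache.

    Each position depends only on the input, which is fully known, so the
    positions are independent of one another. This loop runs them one by
    one for clarity; a real system would batch them into one matrix
    operation.
    """
    cache = {"keys": list(prompt_vectors), "values": list(prompt_vectors)}
    outputs = [
        attend(vec, cache["keys"][: i + 1], cache["values"][: i + 1])
        for i, vec in enumerate(prompt_vectors)
    ]
    return cache, outputs


def decode_step(new_vector, cache):
    """Stage 2: process one new position, reusing cached keys and values.

    Only the new position's work is done here. The step cannot start
    until the previous output exists, which is why this stage is
    sequential.
    """
    cache["keys"].append(new_vector)
    cache["values"].append(new_vector)
    return attend(new_vector, cache["keys"], cache["values"])


# Toy usage: three input vectors, then one generated position.
prompt = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cache, prompt_outputs = prefill(prompt)
next_output = decode_step([0.5, -0.5], cache)
```

In this sketch, the decode step gives the same result for the new position as rerunning the first stage over all four vectors, but it does so without repeating the work for the first three.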
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Inference Engine Optimization
An LLM system receives a long user prompt: 'Summarize the following article about renewable energy... [article text]'. The system processes this entire block of text in a single, parallel computation to prepare for generating the first word of the summary. Which specific stage of the inference process does this action represent?
A system that generates text processes user input in two distinct computational stages. Match each stage with its primary characteristic and function.