Learn Before
When a language model generates a response, it first processes the user's entire input prompt and then generates the output one token at a time. How does the computational approach for these two phases typically differ in terms of how tokens are handled?
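The asymmetry the question points at can be sketched as a toy simulation: the prompt (prefill) phase handles all input tokens in a single pass, while the generation (decode) phase takes one pass per new token, reusing a cache of past states. The function names and the list standing in for a KV cache below are illustrative, not any particular library's API.

```python
def prefill(prompt_tokens):
    """Process the whole prompt in one parallel pass.
    Returns a stand-in 'KV cache' and the number of forward passes used."""
    kv_cache = list(prompt_tokens)  # stand-in for cached key/value states
    return kv_cache, 1              # one pass, regardless of prompt length

def decode(kv_cache, n_new_tokens):
    """Generate tokens autoregressively: one forward pass per token."""
    passes = 0
    for _ in range(n_new_tokens):
        new_token = len(kv_cache)   # dummy "sampled" token
        kv_cache.append(new_token)  # cache grows by one token each step
        passes += 1
    return kv_cache, passes

prompt = [10, 11, 12, 13]                  # 4 prompt tokens
cache, prefill_passes = prefill(prompt)    # whole prompt in one pass
cache, decode_passes = decode(cache, 3)    # one pass per generated token

print(prefill_passes)  # 1
print(decode_passes)   # 3
```

The point of the sketch: prefill cost is one batched pass over the known prompt, while decode cost scales with the number of tokens generated, which is why the two phases are optimized differently in practice.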
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
When a language model is given an initial text prompt, it can process all of the prompt's tokens together in a single parallel pass, since the entire prompt is known in advance, before it starts generating a response.
Processing Asymmetry in Text Generation