Evaluating Context Handling in Language Models
A language model generates text one token at a time, predicting each new token from the sequence of tokens that came before it. One common architectural approach requires the model to have access to the complete history of all previously generated tokens for every new prediction. Analyze the primary advantage and the primary disadvantage of this 'full-context' approach.
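To ground the question, here is a minimal, runnable Python sketch of the 'full-context' approach described above. toy_next_token is a hypothetical stand-in for a real model's forward pass (not any actual library API); the structural point is that every prediction re-reads the entire token history.

```python
def toy_next_token(prefix: list[int], vocab_size: int = 100) -> int:
    """Hypothetical stand-in for a model forward pass.

    Its cost is proportional to len(prefix), mimicking attention over
    the full history; a real Transformer would re-encode or attend to
    the whole prefix here.
    """
    score = sum(prefix)  # touches every token in the history
    return (score + len(prefix)) % vocab_size


def generate_full_context(prompt: list[int], n_new: int) -> list[int]:
    """Full-context autoregressive decoding: the complete history is
    passed to the model for every new prediction."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(toy_next_token(tokens))  # entire prefix, every step
    return tokens


print(generate_full_context([1, 2, 3], n_new=5))
```

Note that step t does work proportional to t under this toy cost model, so generating n tokens costs on the order of n^2 in total, while nothing from the history is ever discarded.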
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Key-Value (KV) Cache in Transformer Inference
A language model using a standard Transformer architecture is generating a long sequence of text one token at a time. How does the computational effort required to generate the 500th token compare to the effort required for the 10th token? (A rough cost sketch follows this list.)
Diagnosing Memory Issues in a Language Model
Difficulty of Training Transformers on Long Sequences
Explicit Context Encoding via Additional Memory Models
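For the KV-cache question above ("Key-Value (KV) Cache in Transformer Inference"), a hedged back-of-the-envelope sketch may help. The counts below are a rough proxy for attention positions touched per new token, not measured FLOPs for any specific model.

```python
def positions_touched_without_cache(t: int) -> int:
    # Without a cache, emitting token t re-encodes all earlier positions,
    # each attending to everything before it: roughly t * t work.
    return t * t


def positions_touched_with_cache(t: int) -> int:
    # With a KV cache, keys/values for the prefix are stored once, so
    # emitting token t computes one new query over ~t cached positions.
    return t


for t in (10, 500):
    print(f"token {t:>3}: no cache ~{positions_touched_without_cache(t):>7}, "
          f"with cache ~{positions_touched_with_cache(t):>4}")
```

Under this proxy, the 500th token costs about 50x as much as the 10th when a KV cache is used, versus roughly 2500x without one.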