Learn Before
Learned vs. Heuristic Weights for Memory Summarization
A language model uses a weighted moving average to create a summary of past information for its memory component. The weights used in this average can either be learned as part of the model's training process or set using a pre-defined heuristic (e.g., giving more weight to recent information). Analyze the potential advantages and disadvantages of each approach (learned vs. heuristic) for determining these weights.
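To make the two options concrete, here is a minimal sketch in plain Python. It rests on assumptions that the prompt does not state: the heuristic uses linearly increasing coefficients, the learned variant applies a softmax to trainable logits, and all function names are hypothetical. It illustrates the comparison and is not a definitive implementation of any particular model.

```python
import math

def heuristic_weights(n):
    """Assumed heuristic: linearly increasing coefficients, normalized to sum to 1.
    The most recent item (last position) receives the largest weight."""
    raw = [i + 1 for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def learned_weights(logits):
    """Assumed learned variant: a softmax over logits that training would adjust.
    Here the logits are just passed in; no training loop is shown."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_moving_average(vectors, weights):
    """Combine past vectors (oldest first) into one summary vector."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

# Three past memory vectors, oldest first (toy values).
past = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Heuristic: fixed recency bias, no parameters to train.
summary_heuristic = weighted_moving_average(past, heuristic_weights(len(past)))

# Learned: equal logits give uniform weights; training could shift them
# toward whichever positions turn out to matter for the task.
logits = [0.0, 0.0, 0.0]
summary_learned = weighted_moving_average(past, learned_weights(logits))
```

The heuristic version adds no parameters and behaves predictably, but its recency bias is fixed regardless of the task. The learned version can adapt to the data, at the cost of extra parameters and a dependence on what the training set contains.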
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Formula for Memory as a Weighted Moving Average of Keys and Values
Increasing Coefficients as a Heuristic for Weighted Moving Average
A language model's memory component creates a summary vector of past information using a weighted moving average. The weights are determined by a heuristic that assigns significantly higher importance to more recent information. For a task like summarizing a long, complex article, what is the most probable impact of this specific weighting scheme on the model's output?
Configuring Memory for Narrative Coherence