Recurrent Update for Memory Caching
A fixed-size memory cache can be maintained through a recurrent update mechanism, in which a recurrent network acts as a cache. At each time step i, the new key-value pair, denoted S_kv, is combined with the memory state from the previous step, Mem_pre: an Update function takes these two inputs and computes the new, compressed memory state, Mem. This lets the model summarize an arbitrarily long history of key-value pairs in a constant-size memory representation, such as a single key-value pair (Size = 1 x 2).
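The update loop above can be sketched in a few lines. This is a minimal illustration under assumptions: the source only says that some Update function combines S_kv with Mem_pre to produce Mem, so the exponential moving average used here as Update, and the dimensions, are hypothetical choices.

```python
import numpy as np

def update(mem_pre, s_kv, beta=0.9):
    """Hypothetical Update function: an exponential moving average that
    combines the previous memory state with the new key-value pair."""
    return beta * mem_pre + (1.0 - beta) * s_kv

d = 4                             # model dimension (illustrative)
mem = np.zeros((2, d))            # one key row and one value row (Size = 1 x 2)
for i in range(1000):             # arbitrarily long stream of key-value pairs
    s_kv = np.random.randn(2, d)  # new (key, value) pair S_kv at step i
    mem = update(mem, s_kv)       # Mem stays the same size at every step

print(mem.shape)                  # (2, 4) regardless of sequence length
```

Whatever form Update takes, the essential property is that its output has the same shape as Mem_pre, so the cache never grows with the sequence.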

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Moving Average of Keys and Values for Memory Component
Weighted Moving Average for Memory Component
Cumulative Average of Keys and Values for Memory Component
An engineer is designing a language model that must process very long sequences while keeping the computational cost of attention constant at each step. They are considering two approaches for the model's memory component:
- Approach 1: The memory stores the raw key-value pairs from the 256 most recent positions in the sequence.
- Approach 2: The memory is a pair of fixed-size 'summary' vectors, which are calculated by mathematically combining all preceding key-value pairs into a single, condensed representation.
Which statement best analyzes the primary trade-off between these two approaches?
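The storage behavior of the two approaches can be sketched as follows. All names and dimensions here are illustrative, and the combining rule used for Approach 2 is one possible choice (a cumulative average), not the only one.

```python
import numpy as np

d = 64  # head dimension (illustrative)

def step_window(window, k, v, size=256):
    """Approach 1: keep the raw key-value pairs of the most recent positions.
    Attention over the window is exact, but older context is dropped."""
    window.append((k, v))
    return window[-size:]

def step_summary(sk, sv, k, v, n):
    """Approach 2: fold each new pair into a fixed pair of summary vectors.
    A cumulative average over all preceding pairs -- lossy but constant-size."""
    return (sk * n + k) / (n + 1), (sv * n + v) / (n + 1)

window = []
sk, sv = np.zeros(d), np.zeros(d)
for n in range(10_000):
    k, v = np.random.randn(d), np.random.randn(d)
    window = step_window(window, k, v)
    sk, sv = step_summary(sk, sv, k, v, n)

print(len(window))         # capped at 256 pairs
print(sk.shape, sv.shape)  # two size-d vectors, regardless of sequence length
```

Both designs keep the per-step attention cost constant; they differ in what is lost: Approach 1 forgets everything beyond the window exactly, while Approach 2 retains all history but only in compressed, lossy form.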
Memory Representation in Attention Mechanisms
Recurrent Update for Memory Caching
Optimizing Memory for Long-Sequence Processing
Learn After
General Formula for Recurrent Memory Update
Cumulative Average of Keys and Values for Memory Component
Recurrent Network as a Cache Mechanism
A system is designed to process an extremely long, continuous sequence of information. To manage this, it uses a memory cache that is updated at each step: a new key-value pair is combined with the entire compressed memory from the previous step to form a new, equally compressed memory state. What is the primary trade-off inherent in this design?
A system maintains a fixed-size memory cache by processing a sequence of key-value pairs one at a time. Arrange the following events in the correct chronological order for a single update step.
Memory Cache State Calculation