Learn Before
Formula for FIFO Memory Update in Compressive Transformer
The local memory component ($\mathrm{Mem}$) of the Compressive Transformer is updated with a First-In, First-Out (FIFO) rule, expressed by the formula $\mathrm{Mem} = \mathrm{FIFO}(S_{kv}^{k}, \mathrm{Mem}_{\mathrm{pre}})$. In this equation, $\mathrm{Mem}_{\mathrm{pre}}$ represents the memory state prior to the update. The function $\mathrm{FIFO}(\cdot)$ appends the key-value pairs from the newest segment, $S_{kv}^{k}$, to the memory and removes the oldest key-value pairs to maintain a fixed size, yielding the new memory state, $\mathrm{Mem}$.
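
The FIFO rule described above can be illustrated with a minimal Python sketch. It is not the Compressive Transformer's actual implementation: the function name `fifo_update`, the `capacity` parameter, and the use of strings in place of per-segment key-value tensors are all assumptions made for illustration. The sketch also returns the evicted entries, on the assumption that a caller may want to pass them on to another component.

```python
def fifo_update(new_segment, memory, capacity):
    """Hypothetical FIFO update: append the newest segment, evict the oldest.

    memory is ordered from oldest to newest. Returns (updated_memory, evicted).
    """
    # Step 1: append the key-value pairs of the newest segment.
    combined = list(memory) + [new_segment]
    # Step 2: drop as many of the oldest entries as needed to respect capacity.
    overflow = max(0, len(combined) - capacity)
    evicted, kept = combined[:overflow], combined[overflow:]
    return kept, evicted


# Strings stand in for the key-value pairs of each segment (an assumption).
mem_pre = ["Seg1", "Seg2", "Seg3", "Seg4"]
mem, evicted = fifo_update("Seg5", mem_pre, capacity=4)
print(mem)      # ['Seg2', 'Seg3', 'Seg4', 'Seg5']
print(evicted)  # ['Seg1']
```

When the memory is not yet full, nothing is evicted and the new segment is simply appended, so the memory size never exceeds `capacity`.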

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Compressive Memory Update in Compressive Transformer
A model's local memory component has a fixed capacity of 4 segments and operates on a First-In, First-Out (FIFO) basis. The memory currently holds the segments [Seg1, Seg2, Seg3, Seg4], where Seg1 is the oldest segment. If a new segment, Seg5, is processed, what will be the resulting state of the memory after the update?
The local memory in a specific transformer model is updated using a First-In, First-Out (FIFO) process to maintain a constant size. Put the two main steps of this update process in the correct order after a new segment of data arrives.
Debugging a Transformer's Memory Behavior
Learn After
A memory component in a sequence processing model is updated using a First-In, First-Out (FIFO) rule. The memory has a fixed capacity of 3 segments. The current state of the memory, ordered from oldest to newest, is [Segment_A, Segment_B, Segment_C]. A new segment, Segment_D, is processed. According to the FIFO update rule, what will be the new state of the memory after the update?
A sequence processing model uses the formula Mem_new = FIFO(S_new, Mem_old) to update its fixed-size memory. Match each component of the formula to its correct description.
The formula Mem = FIFO(S_kv^k, Mem_pre) describes an update to a fixed-size memory buffer in a sequence processing model. Arrange the operations below into the correct chronological order that this formula represents.