Learn Before
Compressive Memory Update in Compressive Transformer
In the Compressive Transformer, the compressive memory (CMem) is updated with the key-value pairs evicted from the local memory (Mem). The update has two steps: first, a compression network compresses the evicted key-value pairs; second, the compressed pairs are appended to the compressive memory, which itself operates as a FIFO queue and drops its oldest entries once it is full.
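The two-step update can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names `mean_pool` and `update_memories`, the compression rate of 2, the memory sizes, and the use of scalar keys and values in place of real key and value vectors. Mean pooling stands in for the compression network, which this card does not specify.

```python
from collections import deque


def mean_pool(pairs, rate):
    """Hypothetical compressor: average every `rate` consecutive (key, value) pairs."""
    out = []
    for i in range(0, len(pairs), rate):
        chunk = pairs[i:i + rate]
        key = sum(k for k, _ in chunk) / len(chunk)
        value = sum(v for _, v in chunk) / len(chunk)
        out.append((key, value))
    return out


def update_memories(mem, cmem, new_pairs, rate=2):
    """Push new pairs into Mem and route whatever Mem evicts into CMem."""
    evicted = []
    for pair in new_pairs:
        # Mem is FIFO: when it is full, the oldest pair leaves first.
        if len(mem) == mem.maxlen:
            evicted.append(mem.popleft())
        mem.append(pair)

    # Step 1: compress the pairs that were evicted from Mem.
    compressed = mean_pool(evicted, rate)

    # Step 2: append them to CMem. A deque with maxlen drops its own
    # oldest entries automatically, which gives the FIFO behaviour.
    cmem.extend(compressed)
    return evicted, compressed


# Toy data: scalar keys and values stand in for real vectors.
mem = deque([(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0)], maxlen=4)
cmem = deque(maxlen=2)

evicted, compressed = update_memories(mem, cmem, [(5.0, 50.0), (6.0, 60.0)])
print("evicted:   ", evicted)
print("compressed:", compressed)
print("Mem:       ", list(mem))
print("CMem:      ", list(cmem))
```

In this sketch, two new pairs push the two oldest pairs out of Mem, and those two are pooled into a single compressed pair before entering CMem. A learned compression network would replace `mean_pool` without changing the surrounding control flow.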
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Formula for FIFO Memory Update in Compressive Transformer
A model's local memory component has a fixed capacity of 4 segments and operates on a First-In, First-Out (FIFO) basis. The memory currently holds the segments [Seg1, Seg2, Seg3, Seg4], where Seg1 is the oldest segment. If a new segment, Seg5, is processed, what will be the resulting state of the memory after the update?
The local memory in a specific transformer model is updated using a First-In, First-Out (FIFO) process to maintain a constant size. Put the two main steps of this update process in the correct order after a new segment of data arrives.
Debugging a Transformer's Memory Behavior
Learn After
Compression of Key-Value Pairs for Compressive Memory
FIFO Update of Compressive Memory
A long-context language model utilizes two distinct memory systems to manage information over time: a primary, fixed-size memory that holds recent, detailed information, and a secondary, compressed memory for older information. The primary memory operates by discarding its oldest entries to accommodate new data. Given this mechanism, what is the most direct source of information for updating the secondary, compressed memory?
A language model is designed with a two-tiered memory system to handle long documents. It has a fixed-size 'short-term memory' for recent, detailed information and a 'long-term memory' for older, summarized information. When a new segment of text is processed, arrange the following events in the correct chronological order to show how information flows between these two memory systems.
Relationship Between Memory Tiers in a Language Model