Compressive Transformer Memory Architecture

Segment-level memory models can be extended to use multiple memory components. The Compressive Transformer is a prime example of this architecture: it employs two distinct, fixed-size memories to cover different spans of the history. A local memory, denoted Mem, captures recent context, while a secondary compressed memory, denoted CMem, summarizes older, long-term history. In this model, the Key-Value (KV) cache used in attention is the concatenation of Mem and CMem.
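The segment-level update can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: mean pooling is used as the compression function (one of the simple choices considered for the Compressive Transformer), and the function name `update_memories` and all sizes are hypothetical.

```python
import numpy as np

def update_memories(mem, cmem, new_states, mem_size, cmem_size, rate):
    """One segment-level update of the two fixed-size memories.

    mem:        local memory Mem, shape (n_mem, d)  -- recent hidden states
    cmem:       compressed memory CMem, shape (n_cmem, d)
    new_states: hidden states of the newest segment, shape (s, d)
    rate:       compression rate (old states mapped to one compressed slot)
    """
    # Append the new segment to the local memory (FIFO).
    mem = np.concatenate([mem, new_states], axis=0)
    if mem.shape[0] > mem_size:
        # States that overflow the fixed-size local memory are evicted...
        evicted, mem = mem[:-mem_size], mem[-mem_size:]
        # ...and compressed (here: mean pooling over windows of `rate` states)
        # before being appended to the fixed-size compressed memory.
        n = (evicted.shape[0] // rate) * rate
        pooled = evicted[:n].reshape(-1, rate, evicted.shape[-1]).mean(axis=1)
        cmem = np.concatenate([cmem, pooled], axis=0)[-cmem_size:]
    # The KV cache for attention is the concatenation of both memories.
    kv = np.concatenate([cmem, mem], axis=0)
    return mem, cmem, kv

# Feed five segments of 8 states each through the update.
d = 4
mem, cmem = np.zeros((0, d)), np.zeros((0, d))
for _ in range(5):
    segment = np.random.randn(8, d)
    mem, cmem, kv = update_memories(mem, cmem, segment,
                                    mem_size=16, cmem_size=8, rate=2)
```

Both memories stay bounded: Mem never exceeds `mem_size` states, and each eviction adds only `len(evicted) / rate` rows to CMem, which is itself truncated to `cmem_size`, so attention cost stays fixed regardless of how much history has been processed.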


Updated 2026-05-02


Ch.2 Generative Models - Foundations of Large Language Models
