Concept

Summary Vectors for Memory Compression in Attention

Instead of a sliding window, the memory component (Mem) can be defined as a pair of summary vectors. This compresses the sequence's history into a compact representation, rather than storing a subset of the raw key-value pairs.
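As a minimal sketch of the idea, the Python below keeps one summary vector for keys and one for values, and updates both as a running mean over the history. The running mean, the class name `SummaryMemory`, and the helper `attend` are illustrative assumptions; the text above does not specify how the summary vectors are computed.

```python
import math


class SummaryMemory:
    """Hypothetical Mem: one summary key vector and one summary value vector."""

    def __init__(self, dim):
        self.count = 0
        self.key = [0.0] * dim
        self.value = [0.0] * dim

    def update(self, k, v):
        # Assumed summary rule: incremental running mean,
        # mean_new = mean_old + (x - mean_old) / n
        self.count += 1
        n = self.count
        self.key = [m + (x - m) / n for m, x in zip(self.key, k)]
        self.value = [m + (x - m) / n for m, x in zip(self.value, v)]


def attend(query, keys, values):
    """Scaled dot-product attention of one query over a short list of key-value pairs."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


# Toy history of (key, value) pairs; the numbers are made up for illustration.
history = [
    ([1.0, 0.0], [2.0, 0.0]),
    ([0.0, 1.0], [0.0, 4.0]),
    ([1.0, 1.0], [4.0, 2.0]),
]

mem = SummaryMemory(dim=2)
for k, v in history:
    mem.update(k, v)

# The query attends to the summary pair plus the current token's own pair.
current_k, current_v = [0.5, 0.5], [1.0, 1.0]
out = attend([1.0, 0.0], [mem.key, current_k], [mem.value, current_v])
```

However long the history grows, the memory holds only two vectors, whereas a sliding window stores one key-value pair per retained position. The trade-off is that individual past tokens can no longer be attended to separately.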

Updated 2026-04-22

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences