1Cademy - Information Loss in Fixed-Size Global Memory

Learn Before

Global Tokens in Attention

Concept

Information Loss in Fixed-Size Global Memory

A significant drawback of a fixed-size global memory, such as a fixed number of global tokens, is the risk of information loss. As the sequence grows longer, a small memory of constant size may no longer be able to summarize the full context, because each memory slot has to stand in for more and more of the input. This creates a trade-off: enlarging the memory (and with it the KV cache) lets the model represent the context more faithfully, but it also increases computational cost.
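The trade-off can be made concrete with a rough back-of-the-envelope sketch. The function names, the model dimensions, and the 2-byte value size below are illustrative assumptions, not part of any particular architecture. The "tokens per slot" ratio is only a crude proxy for compression pressure, not a measurement of actual information loss.

```python
def tokens_per_slot(seq_len: int, num_global: int) -> float:
    """Average number of sequence tokens each global token must summarize.

    A crude proxy: the higher the ratio, the more context is squeezed
    into each fixed-size slot.
    """
    return seq_len / num_global


def global_kv_bytes(num_global: int, num_layers: int, num_heads: int,
                    head_dim: int, bytes_per_value: int = 2) -> int:
    """KV cache bytes attributable to the global tokens.

    Each global token stores one key and one value vector (factor of 2)
    per head per layer. All dimensions here are hypothetical.
    """
    return 2 * num_layers * num_heads * head_dim * num_global * bytes_per_value


# With the memory held fixed, the compression ratio grows with sequence length.
for seq_len in (1_024, 16_384, 262_144):
    print(seq_len, tokens_per_slot(seq_len, num_global=16))

# Enlarging the memory lowers the ratio, but the cache cost grows linearly.
for num_global in (16, 64, 256):
    print(num_global, global_kv_bytes(num_global, num_layers=32,
                                      num_heads=32, head_dim=128))
```

Under these assumptions, holding the number of global tokens constant makes the per-slot load grow in proportion to sequence length, while adding global tokens reduces that load at a cache cost that scales linearly with the memory size.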

Updated 2026-04-23
