Comparing KV Cache Memory Growth
An auto-regressive language model is processing an extremely long document. Compare the growth of its Key-Value (KV) cache memory usage over time under two different scenarios: (1) a standard caching mechanism that stores all previous tokens, and (2) a windowed caching mechanism that only stores the most recent 1024 tokens. Explain the fundamental difference in their space complexity as the sequence length increases.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Related
Space Complexity of Sliding Window Attention
Optimizing Memory for Long-Document Processing
An auto-regressive language model is generating a long text, one token at a time. To manage memory, it employs a key-value caching strategy where it only stores the keys and values for the most recent 2048 tokens. How will the memory allocated for this cache change as the model generates the 5000th token and continues beyond it?