Learn Before
Components of Fixed-Size KV Caches
In Large Language Models (LLMs), a fixed-size KV cache bounds memory use by managing three distinct sets of keys and values: those generated dynamically during active inference, those preserved in the model's primary memory, and those stored or encoded in a compressed memory, which retains older contextual information without exceeding the fixed capacity.
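Below is a minimal sketch of how these three components might fit together. It assumes a recent-token window, an overflow buffer awaiting compression, and a bounded compressed memory; the class name, the window_size/memory_size/ratio parameters, and the mean-pooling compression are illustrative assumptions, not any specific model's implementation.

```python
import torch


class FixedSizeKVCache:
    """Illustrative fixed-size KV cache with three components:
    an exact window of recent keys/values (active inference),
    a buffer of evicted entries awaiting compression, and a
    bounded compressed memory summarizing older context."""

    def __init__(self, window_size: int, memory_size: int, ratio: int = 4):
        self.window_size = window_size  # recent keys/values kept exactly
        self.memory_size = memory_size  # cap on compressed summary entries
        self.ratio = ratio              # evicted entries per summary entry
        self.window: list[tuple[torch.Tensor, torch.Tensor]] = []
        self.buffer: list[tuple[torch.Tensor, torch.Tensor]] = []
        self.memory: list[tuple[torch.Tensor, torch.Tensor]] = []

    def append(self, k: torch.Tensor, v: torch.Tensor) -> None:
        """Cache one new token's key/value; once the window is full,
        evict the oldest entry into the compression buffer."""
        self.window.append((k, v))
        if len(self.window) > self.window_size:
            self.buffer.append(self.window.pop(0))
        if len(self.buffer) == self.ratio:
            self._compress_buffer()

    def _compress_buffer(self) -> None:
        # Mean-pool the buffered keys/values into one summary entry
        # (a real system might use a learned compression instead).
        k_summary = torch.stack([k for k, _ in self.buffer]).mean(dim=0)
        v_summary = torch.stack([v for _, v in self.buffer]).mean(dim=0)
        self.buffer.clear()
        self.memory.append((k_summary, v_summary))
        if len(self.memory) > self.memory_size:
            self.memory.pop(0)  # bounded capacity: drop oldest summary

    def attend_over(self) -> tuple[torch.Tensor, torch.Tensor]:
        """Return every key/value attention should see, oldest first:
        compressed memory, then the buffer, then the exact window."""
        entries = self.memory + self.buffer + self.window
        keys = torch.stack([k for k, _ in entries])
        values = torch.stack([v for _, v in entries])
        return keys, values


# Usage: total cached entries stay bounded regardless of sequence length.
cache = FixedSizeKVCache(window_size=8, memory_size=4, ratio=4)
for _ in range(100):
    cache.append(torch.randn(64), torch.randn(64))
keys, values = cache.attend_over()
print(keys.shape)  # at most window_size + memory_size + ratio entries
```

The design choice this illustrates is the trade-off named in the related question below: recent tokens keep exact keys and values, while older context survives only as lossy summaries, so memory stays constant at the cost of fidelity to distant history.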
Tags
Foundations of Large Language Models
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A language model is designed to process extremely long sequences of text during inference. To manage computational resources, it is implemented with a key-value (KV) cache that has a fixed, limited size. What is the primary trade-off inherent in this specific implementation choice?
Optimizing a Conversational AI for Memory-Constrained Devices
Consequences of Bounded Memory in Text Summarization