1Cademy - Formula for Prefix Cache State Generation

Learn Before

Process of Generating Prefix Caches

Formula

Formula for Prefix Cache State Generation

During the prefilling phase for an input sequence $\mathbf{x} = x_0 x_1 \dots x_{m-1}$, the model generates a sequence of prefixes and their corresponding Key-Value (KV) cache states. This mapping is defined as:

$\begin{matrix} x_0 \ (\mathbf{x}_{<1}) & \Rightarrow & \mathrm{cache}_{<1} \\ x_0 x_1 \ (\mathbf{x}_{<2}) & \Rightarrow & \mathrm{cache}_{<2} \\ & \vdots & \\ x_0 x_1 \dots x_{m-1} \ (\mathbf{x}_{<m}) & \Rightarrow & \mathrm{cache}_{<m} \end{matrix}$

where $\mathrm{cache}_{<i}$ denotes the KV cache state for the prefix $\mathbf{x}_{<i} = x_0 \dots x_{i-1}$. All of these mappings can be stored in the prefix cache for efficient reuse.
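
The mapping can be illustrated with a minimal Python sketch. It is not a real model: `toy_key`, `toy_value`, and `build_prefix_cache` are hypothetical names introduced here, and the string placeholders stand in for the per-layer key and value tensors a transformer would actually compute. The sketch only shows the bookkeeping, where each prefix $\mathbf{x}_{<i}$ is stored alongside its state $\mathrm{cache}_{<i}$.

```python
from typing import Dict, List, Tuple

# One (key, value) pair per token; a cache state is a tuple of such pairs.
KVState = Tuple[Tuple[str, str], ...]


def toy_key(token: str, position: int) -> str:
    # Hypothetical stand-in for a key projection (a real model yields tensors).
    return f"K({token}@{position})"


def toy_value(token: str, position: int) -> str:
    # Hypothetical stand-in for a value projection.
    return f"V({token}@{position})"


def build_prefix_cache(tokens: List[str]) -> Dict[Tuple[str, ...], KVState]:
    """Map every prefix x_{<i}, for i = 1..m, to its cache state cache_{<i}."""
    prefix_cache: Dict[Tuple[str, ...], KVState] = {}
    state: List[Tuple[str, str]] = []
    for position, token in enumerate(tokens):
        # Extending the prefix by one token appends one (key, value) pair.
        state.append((toy_key(token, position), toy_value(token, position)))
        # Store an immutable snapshot so later appends do not alter it.
        prefix_cache[tuple(tokens[: position + 1])] = tuple(state)
    return prefix_cache


prefix_cache = build_prefix_cache(["The", "cat", "sat"])
for prefix, state in prefix_cache.items():
    print(prefix, "=>", state)
```

Under these assumptions, the entry for the prefix `("The", "cat")` holds two key-value pairs, one per token in the prefix, and a later request that begins with the same tokens could look up that entry instead of recomputing it.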

Updated 2026-05-05

References

Reference of Foundations of Large Language Models Course

Learn After

An auto-regressive model is processing the input sequence of tokens: ['The', 'cat', 'sat']. When the model uses the prefix ['The', 'cat'] to generate the next token, 'sat', what is the content of the corresponding Key-Value (KV) cache state that is created at this step?
An auto-regressive model is generating a series of Key-Value (KV) cache states for the input sequence of tokens: ['The', 'quick', 'brown']. Arrange the following events in the correct chronological order in which they occur during this process.
Prefix Cache Reuse Scenario
