Formula

Formula for Updating the Key Matrix in the KV Cache

During autoregressive inference, the Key matrix ($\mathbf{K}$) in the KV cache is expanded at each step. The new key vector, $\mathbf{k}_{i'}$, corresponding to the current token, is appended to the existing matrix of keys. This update operation is expressed by the formula: $\mathbf{K} = \text{Append}(\mathbf{K}, \mathbf{k}_{i'})$
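The append operation can be illustrated with a minimal NumPy sketch. It assumes keys are stored as rows of a 2-D array of shape (tokens, dimension), a layout the text above does not specify. The function name `append_key`, the dimension `d`, and the placeholder key values are hypothetical and chosen only for illustration.

```python
import numpy as np

def append_key(K: np.ndarray, k_new: np.ndarray) -> np.ndarray:
    """Return a new key matrix with k_new appended as the last row."""
    # Assumed layout: K has shape (t, d), one cached key per previous token.
    # k_new has shape (d,), the key vector of the current token.
    return np.concatenate([K, k_new[None, :]], axis=0)

d = 4                     # hypothetical key dimension
K = np.empty((0, d))      # empty cache before the first token

for step in range(3):
    # Stand-in for a real key projection of the current token.
    k_new = np.full(d, float(step))
    K = append_key(K, k_new)   # K = Append(K, k_new)

print(K.shape)  # (3, 4)
```

Concatenating at every step copies the whole matrix, so practical implementations often preallocate a buffer for the maximum sequence length and write each new key into the next free slot. The sketch keeps the literal append to mirror the formula.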

Updated 2026-05-03

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences