Definition

Notation for Current Query, Key, and Value Vectors (q', k', v')

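In autoregressive decoding, the model computes three new vectors for the current token at position i': a query vector (q', also written q_{i'}), a key vector (k', or k_{i'}), and a value vector (v', or v_{i'}). The new key k' and value v' are appended to the Key-Value (KV) cache, which already holds the keys and values of all previous positions. The new query q' is then scored against the cached keys to compute attention scores; under standard causal self-attention this includes k' itself, since a token attends to its own position as well as to earlier ones. Because k' and v' remain in the cache, subsequent tokens can attend to them without recomputing them.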
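The decoding step described above can be sketched in a few lines of NumPy. This is a minimal single-head illustration under stated assumptions, not a reference implementation: the names `decode_step`, `W_q`, `W_k`, `W_v`, `k_cache`, and `v_cache` are hypothetical, the projection matrices are random, and the scores use the usual scaled dot product with the current token attending to its own key.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scores.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def decode_step(x_new, W_q, W_k, W_v, k_cache, v_cache):
    """One decoding step for the current token (hypothetical helper).

    x_new:   (d_model,) hidden state of the current token at position i'
    k_cache: (t, d_head) keys of the t previous positions
    v_cache: (t, d_head) values of the t previous positions
    """
    # Compute q', k', v' for the current token.
    q_new = x_new @ W_q
    k_new = x_new @ W_k
    v_new = x_new @ W_v

    # Append k' and v' to the cache so later tokens can reuse them.
    k_cache = np.vstack([k_cache, k_new[None, :]])
    v_cache = np.vstack([v_cache, v_new[None, :]])

    # Score q' against every cached key (assumption: includes k' itself).
    d_head = q_new.shape[-1]
    scores = k_cache @ q_new / np.sqrt(d_head)
    weights = softmax(scores)

    # Attention output: weighted sum of the cached values.
    out = weights @ v_cache
    return out, k_cache, v_cache

# Usage: decode three tokens, growing the cache by one row per step.
rng = np.random.default_rng(0)
d_model, d_head = 8, 4
W_q, W_k, W_v = (rng.standard_normal((d_model, d_head)) for _ in range(3))
k_cache = np.zeros((0, d_head))
v_cache = np.zeros((0, d_head))
for _ in range(3):
    x_new = rng.standard_normal(d_model)
    out, k_cache, v_cache = decode_step(x_new, W_q, W_k, W_v, k_cache, v_cache)
```

Each step computes only one new query, key, and value; the keys and values of earlier positions are read from the cache rather than recomputed, which is the point of keeping k' and v' around.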

Updated 2026-02-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences