Formula

Set of Sequential Key-Value Pairs

This set collects the cached key and value matrices for all positions up to and including index $i$ in a sequence. It is written $\{(\mathbf{K}^{[1]}_{\leq i}, \mathbf{V}^{[1]}_{\leq i}), \dots, (\mathbf{K}^{[\tau]}_{\leq i}, \mathbf{V}^{[\tau]}_{\leq i})\}$, where the subscript $\leq i$ restricts each matrix to positions $1$ through $i$ and the superscript $[j]$ runs over the $\tau$ pairs in the set. In multi-head attention, $\tau$ is the number of heads rather than the sequence length, since the sequence position is already covered by the subscript. This structure is fundamental to attention mechanisms, particularly in autoregressive decoding, where the keys and values of past positions are cached (the KV cache) so that each new step computes only the key and value for the newest position and reuses the rest.
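The caching idea can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes $\tau$ is the number of attention heads, uses plain scaled dot-product attention with a softmax, and the names `KVCache`, `append`, and `attend` are hypothetical helpers introduced only for this example.

```python
import numpy as np


class KVCache:
    """Hypothetical per-head cache holding (K^[j]_{<=i}, V^[j]_{<=i}) for j = 1..tau."""

    def __init__(self, num_heads, head_dim):
        # One (0, head_dim) matrix per head; rows are added as decoding proceeds.
        self.keys = [np.zeros((0, head_dim)) for _ in range(num_heads)]
        self.values = [np.zeros((0, head_dim)) for _ in range(num_heads)]

    def append(self, k_new, v_new):
        # k_new, v_new: arrays of shape (num_heads, head_dim) for the newest position i.
        for j in range(len(self.keys)):
            self.keys[j] = np.vstack([self.keys[j], k_new[j]])
            self.values[j] = np.vstack([self.values[j], v_new[j]])

    def attend(self, q):
        # q: array of shape (num_heads, head_dim), the query at position i.
        outputs = []
        for j, (K, V) in enumerate(zip(self.keys, self.values)):
            scores = K @ q[j] / np.sqrt(K.shape[1])   # one score per cached position
            weights = np.exp(scores - scores.max())   # numerically stable softmax
            weights /= weights.sum()
            outputs.append(weights @ V)               # weighted sum of cached values
        return np.stack(outputs)                      # shape (num_heads, head_dim)


# Decode three steps with tau = 2 heads; random vectors stand in for model outputs.
rng = np.random.default_rng(0)
cache = KVCache(num_heads=2, head_dim=4)
for i in range(3):
    k, v, q = rng.normal(size=(3, 2, 4))
    cache.append(k, v)        # only the newest key and value are computed and stored
    out = cache.attend(q)     # attention reuses all cached positions <= i
```

After the loop, each head's key and value matrices have three rows, one per decoded position, which is the set described above with $i = 3$ and $\tau = 2$.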

Updated 2026-05-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
