Concept

Multi-Dimensional Structure of the KV Cache

The Key-Value (KV) cache in Transformer models is a dynamic data structure whose size is determined by several dimensions: the number of layers in the model ($L$), the number of attention heads per layer ($\tau$), and the length of the input sequence. Each attention head contributes key and value vectors of a fixed dimensionality ($d_h$), making the overall cache a multi-dimensional entity.
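Multiplying these dimensions gives the total cache footprint. As a minimal sketch, the helper below computes the KV-cache size in bytes from $L$, $\tau$, $d_h$, and the sequence length; the function name and the 7B-class configuration used in the example are illustrative assumptions, not values from the text.

```python
def kv_cache_bytes(num_layers, num_heads, head_dim, seq_len, bytes_per_elem=2):
    """Total KV-cache size in bytes.

    The leading factor of 2 accounts for storing both keys and values;
    bytes_per_elem defaults to 2 for fp16/bf16 storage.
    """
    return 2 * num_layers * num_heads * head_dim * seq_len * bytes_per_elem

# Illustrative 7B-class configuration (assumed, not from the text):
# 32 layers, 32 heads per layer, head dimension 128, 2048-token context, fp16.
size = kv_cache_bytes(num_layers=32, num_heads=32, head_dim=128, seq_len=2048)
print(f"{size / 2**30:.1f} GiB")  # 1.0 GiB
```

Because every factor enters linearly, doubling the context length (or the number of layers) doubles the cache, which is why long-context serving is dominated by KV-cache memory rather than model weights.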

Updated 2026-04-23

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Computing Sciences
