Learn Before
Formula for KV Cache Memory Size
The memory footprint of the Key-Value (KV) cache for a specific context window size can be quantified. The total size is proportional to the product of four key parameters: the number of layers in the model (L), the number of attention heads per layer (H), the dimensionality of each head's key/value vectors (D_head), and the size of the context window (n). The overall memory complexity is therefore given by the formula: O(L · H · D_head · n).
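The formula above can be sketched as a small calculation. This is a minimal illustration, not a definitive sizing tool: the function name, the factor of 2 (one tensor for keys, one for values), the bytes-per-value default (fp16), and the example model shape (32 layers, 32 heads, head dimension 128) are all assumptions added here for concreteness.

```python
def kv_cache_bytes(num_layers, num_heads, head_dim, context_len,
                   bytes_per_value=2):
    """Approximate KV cache size in bytes.

    Memory ~ L * H * D_head * n, times 2 because both keys and values
    are cached, times the storage size of each value (2 bytes for fp16).
    """
    return 2 * num_layers * num_heads * head_dim * context_len * bytes_per_value

# Hypothetical 7B-class model: 32 layers, 32 heads, head_dim 128,
# 4096-token context, fp16 storage.
size = kv_cache_bytes(32, 32, 128, 4096)
print(f"{size / 2**30:.1f} GiB")  # -> 2.0 GiB
```

Note that the cache grows linearly in the context length: doubling `context_len` doubles the memory, which is the scaling behavior probed by the questions below.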

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Reducing KV Cache Complexity via Windowed Caching
An engineer is deploying a large autoregressive model for a chatbot. They observe that as a conversation with a user gets longer, the model's memory consumption increases steadily, eventually leading to performance issues. This is because the model stores key and value vectors for every token in the conversation history to speed up the generation of the next token. Based on this mechanism, what is the fundamental relationship between the length of the conversation history (in tokens) and the amount of memory required for this storage?
KV Cache Memory Footprint Comparison
Calculating Memory Growth for Token Caching
Reducing KV Cache Complexity via Head Sharing
Learn After
An autoregressive language model uses a key-value cache to store contextual information during text generation. A developer decides to double the maximum sequence length that the model can process. Assuming all other architectural parameters (such as the number of layers, number of attention heads, and the dimensionality of each head) remain constant, by what factor will the maximum memory required for the key-value cache change?
Optimizing KV Cache for a Chatbot Application
KV Cache Memory Calculation