Individual Attention Head Formula in Multi-Query Attention (MQA)

In Multi-Query Attention (MQA), the output of an individual head $j$ is calculated using that head's own query vector, $\mathbf{q}_i^{[j]}$, together with the Key and Value matrices, $\mathbf{K}_{\le i}$ and $\mathbf{V}_{\le i}$, which are shared across all heads. This is represented by the formula:

$$\mathrm{head}_j = \mathrm{Att}_{\mathrm{qkv}}(\mathbf{q}_{i}^{[j]},\mathbf{K}_{\le i},\mathbf{V}_{\le i})$$
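
As a rough illustration of the formula, the sketch below computes one output per head from per-head queries and a single shared Key/Value pair. It assumes that $\mathrm{Att}_{\mathrm{qkv}}$ is standard scaled dot-product attention, which this page does not spell out. The names `att_qkv` and `mqa_heads` and the toy numbers are hypothetical, chosen only for this example.

```python
import math


def att_qkv(q, K, V):
    """Attention output for one query vector.

    Assumption: Att_qkv is scaled dot-product attention,
    softmax(q K^T / sqrt(d)) V. No mask is needed here because
    K and V already hold only positions <= i.
    """
    d = len(q)
    # One score per key row: dot(q, k) / sqrt(d).
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value rows.
    d_v = len(V[0])
    return [sum(w * v[c] for w, v in zip(weights, V)) for c in range(d_v)]


def mqa_heads(queries, K, V):
    """head_j = Att_qkv(q_i^[j], K_<=i, V_<=i) for every head j.

    Each head has its own query; all heads read the same K and V.
    """
    return [att_qkv(q_j, K, V) for q_j in queries]


# Toy data (hypothetical): 3 cached positions, 2 heads, dimension 2.
K_le_i = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # shared keys
V_le_i = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shared values
q_i = [[1.0, 0.0], [0.0, 1.0]]                 # one query per head

heads = mqa_heads(q_i, K_le_i, V_le_i)
```

Because `K_le_i` and `V_le_i` are passed unchanged to every head, only the query differs from head to head, so a single Key/Value pair needs to be stored per position regardless of the number of heads.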

Updated 2026-04-23

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences