Q/KV Heads Notation in LLM Architectures

Definition

When comparing the architectures of different Large Language Models (LLMs), the number of attention heads is often expressed in an a/b format. In this notation, a indicates the number of attention heads used for queries, while b indicates the number of heads used for both keys and values. For example, a model listed as 32/8 has 32 query heads that share 8 key/value heads.
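A minimal NumPy sketch of how the a/b notation plays out in attention, assuming a hypothetical 32/8 model: the 8 key/value heads are repeated so that each one serves a group of 4 query heads (the head counts, head dimension, and sequence length here are illustrative, not taken from any specific model).

```python
import numpy as np

# Hypothetical model described as "32/8" in the a/b notation.
num_q_heads = 32   # a: query heads
num_kv_heads = 8   # b: key/value heads
head_dim = 64
seq_len = 4

# Query heads per key/value head.
group_size = num_q_heads // num_kv_heads  # 4

rng = np.random.default_rng(0)
q = rng.standard_normal((num_q_heads, seq_len, head_dim))
k = rng.standard_normal((num_kv_heads, seq_len, head_dim))
v = rng.standard_normal((num_kv_heads, seq_len, head_dim))

# Each KV head is shared by `group_size` query heads: repeat K and V
# along the head axis so their shapes line up with Q.
k_expanded = np.repeat(k, group_size, axis=0)  # (32, seq_len, head_dim)
v_expanded = np.repeat(v, group_size, axis=0)  # (32, seq_len, head_dim)

# Scaled dot-product attention per query head.
scores = q @ k_expanded.transpose(0, 2, 1) / np.sqrt(head_dim)
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
out = weights @ v_expanded

print(out.shape)
```

Because K and V are stored for only b heads rather than a, a smaller b shrinks the KV cache while leaving the number of query heads, and hence the output shape, unchanged.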

Updated 2026-04-19

Tags

Foundations of Large Language Models