Individual Attention Head Formula in Multi-Query Attention (MQA)
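In Multi-Query Attention (MQA), the output of an individual head $j$ is computed from that head's own query vector, $\mathbf{q}^{[j]}$, together with the Key and Value matrices, $\mathbf{K}$ and $\mathbf{V}$, which are shared across all heads. This is represented by the formula:

$$\mathrm{head}_j = \mathrm{Att}_{\mathrm{qkv}}\left(\mathbf{q}^{[j]}, \mathbf{K}, \mathbf{V}\right)$$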
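As a rough illustration of the per-head computation described above, the following NumPy sketch computes each head's output from its own query vector while reusing one Key matrix and one Value matrix for every head. It assumes the attention function is standard scaled dot-product attention, which the text above does not state explicitly; the function names `mqa_head` and `mqa_heads` and all shapes are hypothetical choices made for this example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

def mqa_head(q_j, K, V):
    """Output of one MQA head (hypothetical helper).

    q_j: (d,)     query vector unique to head j
    K:   (n, d)   Key matrix shared by all heads
    V:   (n, d_v) Value matrix shared by all heads
    Assumes scaled dot-product attention.
    """
    d = q_j.shape[-1]
    scores = K @ q_j / np.sqrt(d)   # (n,) similarity of q_j to each key
    weights = softmax(scores)       # (n,) attention weights, sum to 1
    return weights @ V              # (d_v,) weighted sum of value rows

def mqa_heads(Q, K, V):
    # Q: (num_heads, d). Every head reuses the same K and V.
    return np.stack([mqa_head(q_j, K, V) for q_j in Q])

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 heads, one query vector each
K = rng.normal(size=(5, 8))   # 5 positions, shared keys
V = rng.normal(size=(5, 3))   # 5 positions, shared values
heads = mqa_heads(Q, K, V)    # shape (4, 3): one output per head
```

Because only the queries differ between heads, a single copy of `K` and `V` has to be stored regardless of the number of heads, which is the property that distinguishes MQA from standard multi-head attention.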
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences