Multiple Choice

In a specific attention mechanism, there are 8 query heads (indexed j=1 to 8) and 2 distinct Key-Value (KV) groups (indexed g=1 to 2). Query heads 1 through 4 are assigned to KV group 1, while query heads 5 through 8 are assigned to KV group 2. The output for a given query head j is calculated based on its own query vector q^[j] and the Key-Value pair from its assigned group, (K^[g(j)], V^[g(j)]). Which Key-Value pair will query head 6 use for its computation?
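The head-to-group assignment described above can be sketched as a small lookup function. This is only an illustrative sketch: the name `kv_group` and its parameters are hypothetical, and it assumes query heads are split into contiguous, equal-sized blocks, which matches the assignment stated in the question (heads 1 through 4 to group 1, heads 5 through 8 to group 2).

```python
def kv_group(j: int, num_query_heads: int = 8, num_kv_groups: int = 2) -> int:
    """Return the 1-based KV group index g(j) for the 1-based query head j.

    Hypothetical helper: assumes heads are assigned to groups in
    contiguous, equal-sized blocks, as in the question's setup.
    """
    if not 1 <= j <= num_query_heads:
        raise ValueError(f"query head index must be in 1..{num_query_heads}")
    if num_query_heads % num_kv_groups != 0:
        raise ValueError("query heads must divide evenly among KV groups")
    heads_per_group = num_query_heads // num_kv_groups
    # Shift to 0-based, find the block, then shift back to 1-based.
    return (j - 1) // heads_per_group + 1


# Map every query head to the KV group whose (K, V) pair it reads.
assignment = {j: kv_group(j) for j in range(1, 9)}
for head, group in assignment.items():
    print(f"query head {head} uses (K^[{group}], V^[{group}])")
```

Looking up a head index in `assignment` gives the group whose Key-Value pair that head shares with the other heads in its block.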


Updated 2025-09-28


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models
