Multiple Choice

In linear attention, the output at position i is computed as Output_i = (q'_i μ_i) / (q'_i ν_i), where q'_i is the transformed query, μ_i is the accumulated key-value state, and ν_i is the accumulated key state. What is the primary role of the denominator term q'_i ν_i?
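For reference, here is a minimal pure-Python sketch of how the quantities in this formula could be computed in a causal setting. It is an illustration under stated assumptions, not the course's implementation: the feature map `phi` (elu(x) + 1) and the function name `linear_attention` are hypothetical choices, and μ_i and ν_i are taken to be running sums over positions 1..i.

```python
import math

def phi(x):
    # Assumed feature map (not given in the question): elu(x) + 1,
    # which keeps every entry strictly positive.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def linear_attention(queries, keys, values):
    """Causal linear attention over lists of vectors (hypothetical helper)."""
    d_k, d_v = len(keys[0]), len(values[0])
    mu = [[0.0] * d_v for _ in range(d_k)]  # accumulated key-value state, d_k x d_v
    nu = [0.0] * d_k                        # accumulated key state, length d_k
    outputs = []
    for q, k, v in zip(queries, keys, values):
        q_t, k_t = phi(q), phi(k)           # transformed query q'_i and key k'_i
        for a in range(d_k):
            nu[a] += k_t[a]                 # nu_i = nu_{i-1} + k'_i
            for b in range(d_v):
                mu[a][b] += k_t[a] * v[b]   # mu_i = mu_{i-1} + k'_i^T v_i
        # Numerator q'_i mu_i: a vector of length d_v.
        numer = [sum(q_t[a] * mu[a][b] for a in range(d_k)) for b in range(d_v)]
        # Denominator q'_i nu_i: a scalar, the sum of the unnormalized
        # similarity scores between q'_i and every key seen so far.
        denom = sum(q_t[a] * nu[a] for a in range(d_k))
        outputs.append([n / denom for n in numer])
    return outputs
```

Calling `linear_attention(queries, keys, values)` with equal-length lists of query, key, and value vectors returns one output vector per position. Only the two running states are kept between steps, so no attention matrix over earlier positions is stored.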

Updated 2026-04-22

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science