Formula

Linear Attention Output Calculation

In this variant of linear attention, the final output is computed by combining the transformed query vector $\mathbf{q'}_i$ with the accumulated state variables $\mu_i$ and $\nu_i$. The numerator is the product of the query and the key-value state $\mu_i$, while the denominator is the product of the query and the key state $\nu_i$, which serves as a normalization term:

$$\text{Att}_{\text{linear}}(\mathbf{q}_i, \mathbf{K}_{\le i}, \mathbf{V}_{\le i}) = \frac{\mathbf{q'}_i \mu_i}{\mathbf{q'}_i \nu_i}$$

This replaces the standard Softmax operation with simple matrix-vector products, leading to computational savings.
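The recurrence above can be sketched in NumPy. This is a minimal illustration, not the book's implementation: the feature map `phi` (here ELU(x) + 1, a common choice in linear-attention work) and all variable names are assumptions; the source only specifies that $\mu_i$ accumulates key-value products, $\nu_i$ accumulates keys, and the output is their ratio against the transformed query.

```python
import numpy as np

def phi(x):
    # Hypothetical positive feature map; ELU(x) + 1 is a common choice.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Causal linear attention via accumulated states mu and nu.

    Q, K: (seq_len, d_k) query/key matrices; V: (seq_len, d_v) values.
    """
    n, d_k = Q.shape
    d_v = V.shape[1]
    mu = np.zeros((d_k, d_v))   # key-value state: sum of phi(k_j) v_j^T
    nu = np.zeros(d_k)          # key state: sum of phi(k_j)
    out = np.zeros((n, d_v))
    for i in range(n):
        mu += np.outer(phi(K[i]), V[i])   # update accumulated states
        nu += phi(K[i])
        q = phi(Q[i])                     # transformed query q'_i
        out[i] = (q @ mu) / (q @ nu)      # numerator / normalizer
    return out
```

Because the states are updated incrementally, each step costs O(d_k · d_v) regardless of sequence length, which is the source of the computational savings over Softmax attention.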


Updated 2026-04-22

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences