Formula

Recurrent Computation of $\mu_i$ and $\nu_i$ in Linear Attention

In this model, the variables $\mu_i$ and $\nu_i$ serve as representations of the sequence history up to position $i$. They are calculated using recurrent forms, effectively summarizing past data: $\mu_i = \mu_{i-1} + \mathbf{k'}_i^{\mathrm{T}} \mathbf{v}_i$ and $\nu_i = \nu_{i-1} + \mathbf{k'}_i^{\mathrm{T}}$.
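The two recurrences can be illustrated with a minimal NumPy sketch. It assumes that each $\mathbf{k'}_i$ and $\mathbf{v}_i$ is a row vector, so $\mathbf{k'}_i^{\mathrm{T}} \mathbf{v}_i$ is an outer product, and that both states start at zero ($\mu_0 = 0$, $\nu_0 = 0$); neither assumption is stated in the text above. The function name `recurrent_states` and the array names `K_prime` and `V` are hypothetical, and $\nu_i$ is stored as a 1-D array rather than an explicit column vector.

```python
import numpy as np

def recurrent_states(K_prime, V):
    """Return mu_i and nu_i for every position i = 1..n.

    K_prime: (n, d_k) array whose rows are the transformed keys k'_i.
    V:       (n, d_v) array whose rows are the values v_i.
    """
    n, d_k = K_prime.shape
    d_v = V.shape[1]
    mu = np.zeros((d_k, d_v))  # assumed initial state mu_0 = 0
    nu = np.zeros(d_k)         # assumed initial state nu_0 = 0
    mus, nus = [], []
    for i in range(n):
        # mu_i = mu_{i-1} + k'_i^T v_i  (outer product, shape d_k x d_v)
        mu = mu + np.outer(K_prime[i], V[i])
        # nu_i = nu_{i-1} + k'_i^T
        nu = nu + K_prime[i]
        mus.append(mu)
        nus.append(nu)
    return np.stack(mus), np.stack(nus)

# Toy data: 4 positions, key dimension 3, value dimension 2.
rng = np.random.default_rng(0)
K_prime = rng.random((4, 3))
V = rng.random((4, 2))
mus, nus = recurrent_states(K_prime, V)
```

Each step touches only the previous state and the current key and value, so the per-position cost does not grow with the sequence length. Unrolling the recurrence shows that the last state equals the full sum over positions, which is why `mus[-1]` agrees with `K_prime.T @ V` and `nus[-1]` with `K_prime.sum(axis=0)`.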

Updated 2026-05-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
