Learn Before
Recurrent Computation of μ and ν in Linear Attention
In this model, the variables μ and ν serve as representations of the sequence history up to position i. They are calculated using recurrent forms, effectively summarizing past data: μ_i = μ_{i-1} + k'_i^T * v_i and ν_i = ν_{i-1} + k'_i^T.
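These recurrent updates can be sketched in NumPy. This is a minimal illustration, not the source's implementation: the feature map phi (ReLU here) and the small eps added to the denominator are illustrative assumptions.

```python
import numpy as np

def linear_attention_step(q, k, v, mu_prev, nu_prev,
                          phi=lambda x: np.maximum(x, 0.0), eps=1e-9):
    """One recurrent step of linear attention.

    mu accumulates key-value products (the numerator state);
    nu accumulates transformed keys (the normalizing state).
    phi is an assumed nonnegative feature map (ReLU here).
    """
    q_prime = phi(q)                     # transformed query q'_i, shape (d,)
    k_prime = phi(k)                     # transformed key k'_i, shape (d,)
    mu = mu_prev + np.outer(k_prime, v)  # mu_i = mu_{i-1} + k'_i^T * v_i, shape (d, d_v)
    nu = nu_prev + k_prime               # nu_i = nu_{i-1} + k'_i^T, shape (d,)
    out = (q_prime @ mu) / (q_prime @ nu + eps)  # Output = (q'_i * mu_i) / (q'_i * nu_i)
    return out, mu, nu

# Process tokens one at a time, carrying mu and nu forward.
d, d_v = 2, 2
mu, nu = np.zeros((d, d_v)), np.zeros(d)
q = k = np.array([1.0, 1.0])
v = np.array([2.0, 3.0])
out, mu, nu = linear_attention_step(q, k, v, mu, nu)
# With a single token in history, the normalized output recovers v (up to eps).
```

Because μ and ν are fixed-size running sums, each step costs the same regardless of sequence length, which is the efficiency argument this card family builds on.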

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
In a memory-efficient attention mechanism, the output for a token at position i is calculated using the formula: Output = (q'_i * μ_i) / (q'_i * ν_i). In this formula, q'_i is the token's processed query, while μ_i and ν_i are aggregations of historical information from all tokens up to and including position i. Specifically, μ_i aggregates past key-value products, and ν_i aggregates past keys. What is the primary function of the denominator, q'_i * ν_i?
Efficiency of Aggregated State in Attention
Evaluating a Modification to the Linear Attention Formula
In the formula for calculating a linear attention output, Output = (q'_i * μ_i) / (q'_i * ν_i), where q'_i is the transformed query, μ_i is the accumulated key-value state, and ν_i is the accumulated key state, what is the primary role of the denominator term q'_i * ν_i?
Calculating a Linear Attention Output Vector
Recurrent Computation of μ and ν in Linear Attention
Recurrent Memory Models as a Basis for Self-Attention Alternatives
Recursive Formula for Memory as a Cumulative Average
A recurrent model with an internal state h is processing a sequence of inputs. The state is updated at each step according to the rule h_i = f(h_{i-1}, input_i), where h_{i-1} is the state from the previous step and input_i is the current input. When the model processes the third input in a sequence, what information does the term h_2 (the state after the second input) represent in the computation for the new state h_3?
Analysis of Sequential Information Processing
A neural network processes a sequence of inputs by updating a hidden state h at each step i using the formula: h_i = f(h_{i-1}, input_i). Which component in this formula is directly responsible for carrying forward a compressed summary of the entire sequence processed up to the previous step (i-1)?
Recurrent Computation of μ and ν in Linear Attention
Real-Time Applications of Recurrent Models
Resurgence of Recurrent Models in Large Language Models
Sequential Token Processing in Recurrent Models
Comparison of Efficient LLM Architectures
Learn After
Computational and Memory Efficiency of Linear Attention's Recurrent Method
A sequential model updates two history-representing variables, μ and ν, at each step i using the following rules:
μ_i = μ_{i-1} + k'_i^T * v_i
ν_i = ν_{i-1} + k'_i^T
Consider the update at a single step i. If the input value vector v_i is a zero vector (a vector of all zeros), but the input key vector k'_i is a non-zero vector, what is the outcome of the update from step i-1 to step i?
Recurrent State Update Calculation
Unrolling a Recurrent State Update
Linear Attention Output Calculation