Learn Before
In a simplified attention mechanism, the history of key-value pairs up to a position i is summarized by two state variables: μ_i, which is the cumulative sum of outer products between transformed key vectors and their corresponding value vectors (Σ k'_jᵀ v_j), and ν_i, which is the cumulative sum of the transformed key vectors (Σ k'_jᵀ).
Given the following sequence of 2-dimensional vectors up to position i=2:
k'_0 = [1, 0], v_0 = [3, 4]
k'_1 = [0, 2], v_1 = [5, 6]
k'_2 = [1, 1], v_2 = [7, 8]
Calculate the state variables μ_2 and ν_2.
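One way to check the arithmetic is a short plain-Python sketch (the helper `outer` and all variable names are illustrative, not from the source): it accumulates the outer products k'_jᵀ v_j into μ_2 and the key vectors into ν_2.

```python
# Transformed keys k'_j and values v_j for j = 0, 1, 2
keys = [[1, 0], [0, 2], [1, 1]]
values = [[3, 4], [5, 6], [7, 8]]

def outer(k, v):
    """Outer product k^T v, returned as a list of rows."""
    return [[ki * vj for vj in v] for ki in k]

# mu_2: elementwise sum of the three outer-product matrices
mu_2 = [[0, 0], [0, 0]]
for k, v in zip(keys, values):
    m = outer(k, v)
    for r in range(2):
        for c in range(2):
            mu_2[r][c] += m[r][c]

# nu_2: elementwise sum of the transformed key vectors
nu_2 = [sum(col) for col in zip(*keys)]

print(mu_2)  # [[10, 12], [17, 20]]
print(nu_2)  # [2, 3]
```

So μ_2 = [[10, 12], [17, 20]] and ν_2 = [2, 3].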
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
In a specific type of attention mechanism, the history of key-value pairs up to a position i is summarized by two state variables: a matrix μ_i and a vector ν_i. They are defined as cumulative sums:
μ_i = Σ_{j=0 to i} (k'_jᵀ v_j) (sum of outer products)
ν_i = Σ_{j=0 to i} k'_jᵀ (sum of transformed key vectors)
Suppose you have already computed the state variables μ_i and ν_i for a sequence up to position i. To compute the next state variables, μ_{i+1} and ν_{i+1}, what is the only additional information you need?
Computational Advantage of State Variables
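The incremental update asked about in the related question can be sketched in plain Python (the helper `update_state` is a hypothetical name, not from the source). It shows that only the new pair (k'_{i+1}, v_{i+1}) is needed to advance the state:

```python
def update_state(mu, nu, k_new, v_new):
    """Advance the state by one position:
    mu_{i+1} = mu_i + outer(k'_{i+1}, v_{i+1}),
    nu_{i+1} = nu_i + k'_{i+1}.
    Only the new key-value pair is required, not the full history."""
    mu_next = [[mu[r][c] + k_new[r] * v_new[c] for c in range(len(v_new))]
               for r in range(len(k_new))]
    nu_next = [nu[j] + k_new[j] for j in range(len(k_new))]
    return mu_next, nu_next

# Example: step from i=1 to i=2 using the vectors from the question.
mu_1 = [[3, 4], [10, 12]]  # outer([1,0],[3,4]) + outer([0,2],[5,6])
nu_1 = [1, 2]              # [1,0] + [0,2]
mu_2, nu_2 = update_state(mu_1, nu_1, [1, 1], [7, 8])
print(mu_2)  # [[10, 12], [17, 20]]
print(nu_2)  # [2, 3]
```

This constant-size recurrence is the computational advantage: each step costs O(d²) regardless of sequence length, instead of re-summing over all previous positions.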