
Learn Before

Recurrent Computation of $\mu_i$ and $\nu_i$ in Linear Attention

Multiple Choice

A sequential model updates two history-representing variables, $\mu$ and $\nu$, at each step $i$ using the following rules:

$$\mu_i = \mu_{i-1} + {k'_i}^{\top} v_i$$

$$\nu_i = \nu_{i-1} + {k'_i}^{\top}$$

Consider the update at a single step $i$. If the input value vector $v_i$ is the zero vector (all entries are zero) but the input key vector $k'_i$ is non-zero, what is the outcome of the update from step $i-1$ to step $i$?
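
To reason about the question, it can help to run one update step numerically. The sketch below is only an illustration under assumed conventions that the question does not state: keys and values are treated as row vectors, so the product of the transposed key and the value is an outer product, μ is a matrix, and ν is a vector. The function names `outer` and `step`, the dimensions, and all numbers are hypothetical.

```python
def outer(k, v):
    """Outer product k^T v for row vectors k (length d) and v (length d_v)."""
    return [[k_a * v_b for v_b in v] for k_a in k]


def step(mu_prev, nu_prev, k, v):
    """One recurrent update: mu_i = mu_{i-1} + k^T v and nu_i = nu_{i-1} + k^T."""
    kv = outer(k, v)
    mu = [[m + d for m, d in zip(m_row, d_row)]
          for m_row, d_row in zip(mu_prev, kv)]
    nu = [n + k_a for n, k_a in zip(nu_prev, k)]
    return mu, nu


# Hypothetical state before step i (assumed sizes: d = 2, d_v = 3).
mu_prev = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
nu_prev = [0.5, 1.5]

k_i = [2.0, -1.0]      # non-zero key vector
v_i = [0.0, 0.0, 0.0]  # zero value vector

mu_i, nu_i = step(mu_prev, nu_prev, k_i, v_i)

print(mu_i == mu_prev)  # True: k^T v is all zeros, so mu is unchanged
print(nu_i)             # [2.5, 0.5]: nu still accumulates the key
```

In this run the outer product contributes only zeros, so μ carries over unchanged, while ν changes because its update does not depend on the value vector.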

Updated 2025-09-29
