Formula for Memory as a Moving Average of Keys and Values
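The memory component, $\mathrm{Mem}$, can be defined as a pair of summary vectors: the unweighted moving averages of the last $n_c$ key vectors and the last $n_c$ value vectors. At time step $i$, this is formally expressed as:

$$\mathrm{Mem} = \left( \frac{\sum_{j=i-n_c+1}^{i} \mathbf{k}_j}{n_c},\ \frac{\sum_{j=i-n_c+1}^{i} \mathbf{v}_j}{n_c} \right)$$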
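The unweighted moving average described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not a reference implementation: the function name `moving_average_memory`, the array layout (one row per time step), and the choice to average over whatever steps are available when fewer than the window size exist are all assumptions made for the example, and the input vectors are made-up values.

```python
import numpy as np

def moving_average_memory(keys, values, i, n_c):
    """Return (average_key, average_value) over the last n_c steps ending at step i.

    keys, values: array-likes of shape (num_steps, dim); row 0 holds time step 1.
    i: current time step, 1-indexed.
    n_c: window size (number of most recent key-value pairs to average).
    Assumption: if fewer than n_c steps exist, average the steps that are available.
    """
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values, dtype=float)
    if n_c <= 0:
        raise ValueError("n_c must be positive")
    if not 1 <= i <= len(keys):
        raise ValueError("i must be a valid 1-indexed time step")

    start = max(0, i - n_c)  # 0-indexed row of time step i - n_c + 1
    # Each vector in the window gets the same weight, 1 / (window length).
    return keys[start:i].mean(axis=0), values[start:i].mean(axis=0)

# Hypothetical 2-dimensional keys and values for three time steps.
keys = [[0, 2], [2, 4], [4, 6]]
values = [[1, 1], [3, 3], [5, 5]]

mem_key, mem_value = moving_average_memory(keys, values, i=3, n_c=2)
print(mem_key, mem_value)  # averages of steps 2 and 3: key [3, 5], value [4, 4]
```

Because every pair in the window is weighted equally, the result does not depend on the order of the pairs inside the window, and each pair contributes a share of 1/n_c to the summary.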

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of a Moving Average-based Cache
Cumulative Average of Keys and Values for Memory Component
Calculating a Memory Component Summary
When using a moving average of the last n key-value pairs to create a single summary vector for a memory component, what is the primary effect of significantly increasing the window size n?
Weighted Moving Average for Memory Component
A memory component in a transformer-based model is designed to create a summary by computing the simple, unweighted average of the last 10 key-value pairs. Which statement accurately describes a fundamental property of this specific summarization method?
Learn After
A model's memory component is calculated as the unweighted moving average of the last n_c key and value vectors. Given the following sequence of 2-dimensional key (k) and value (v) vectors at four consecutive time steps, and a context window size n_c = 3, what is the memory component (average_key, average_value) at the fourth time step (i = 4)?
k_1 = [1, 2], v_1 = [10, 11]
k_2 = [3, 4], v_2 = [12, 13]
k_3 = [5, 6], v_3 = [14, 15]
k_4 = [7, 8], v_4 = [16, 17]
Impact of Context Window Size on Memory
Consider the formula for a memory component calculated as an unweighted moving average of the last n_c key and value vectors: Mem = ( (Σ k_j) / n_c, (Σ v_j) / n_c ). If the context window size n_c is increased, the influence of any single key-value pair (k_j, v_j) within that window on the final memory component will also increase.