Learn Before
Constant Memory Size in Fixed-Window Attention
Consider the formula for constructing a memory component in a fixed-size window attention mechanism: Mem = (K_[i-n_c+1,i], V_[i-n_c+1,i]). Explain mathematically why the number of key-value pairs in Mem is always equal to the context window size, n_c, for any processing step i where i ≥ n_c.
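The counting argument is a one-line arithmetic fact: the inclusive range [i − n_c + 1, i] contains i − (i − n_c + 1) + 1 = n_c positions, independent of i. A minimal sketch of this, assuming 1-based sequence indexing as in the formula (the function name `window_indices` is illustrative, not from the card):

```python
def window_indices(i, n_c):
    """Indices of the key-value pairs kept in Mem at step i (1-based),
    i.e. the inclusive range [i - n_c + 1, i]."""
    return list(range(i - n_c + 1, i + 1))

# For any step i >= n_c, the window always holds exactly n_c indices:
for i in [5, 10, 100]:
    idx = window_indices(i, n_c=5)
    print(i, idx, len(idx))  # length is 5 regardless of i
```

Running this shows the window sliding forward as i grows while its size stays fixed at n_c, which is exactly why fixed-window attention has constant memory cost per step.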
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An attention mechanism is processing the 10th element of a sequence (where the current index i = 10) and is configured with a context window size of 5 (n_c = 5). Based on the standard formula for constructing a memory component from a fixed-size window, Mem = (K_[i-n_c+1,i], V_[i-n_c+1,i]), which set of key vectors (represented by their indices) would be included in the key matrix K at this step?

In an attention mechanism using a fixed-size window, the memory component at step i is constructed using the formula Mem = (K_[i-n_c+1,i], V_[i-n_c+1,i]), where n_c is the context window size. What is the direct consequence of increasing the value of n_c?

Constant Memory Size in Fixed-Window Attention
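The second related question above concerns the effect of n_c on memory cost; the direct consequence is that the number of stored key-value pairs grows linearly with n_c. A sketch of that relationship, assuming 1-based indexing (`mem_size` is an illustrative helper, not from the card; it also handles early steps i < n_c, where fewer than n_c pairs exist):

```python
def mem_size(i, n_c):
    """Number of key-value pairs stored in Mem at step i with window n_c."""
    lo = max(1, i - n_c + 1)  # before the window fills, the range starts at 1
    return i - lo + 1

# At a fixed step i = 100, memory grows linearly with the window size:
for n_c in [4, 8, 16]:
    print(n_c, mem_size(100, n_c))  # memory equals n_c once i >= n_c
```

Doubling n_c doubles both the storage for (K, V) and the per-step attention computation, which is the trade-off fixed-window attention makes against full attention.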