Learn Before
A memory-based attention mechanism updates its fixed-size memory state, Mem, at each time step i using a general recurrent formula: Mem_new = f((k_i, v_i), Mem_old), where (k_i, v_i) is the current key-value pair and Mem_old is the memory state from the previous step. Which of the following update procedures does NOT conform to this recurrent structure?
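As a rough illustration of what conforming to this structure means, the sketch below implements one possible update f: a cumulative average of keys and values, kept as running sums plus a counter. The names Mem, init_mem, update, and read_average are hypothetical and chosen only for this example. The point it shows is that each step reads only the current pair (k_i, v_i) and Mem_old, never the earlier pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mem:
    """Hypothetical fixed-size memory: running sums and a step counter."""
    key_sum: tuple
    value_sum: tuple
    count: int


def init_mem(dim):
    """Empty memory for keys and values of length `dim` (an assumed layout)."""
    zeros = tuple(0.0 for _ in range(dim))
    return Mem(key_sum=zeros, value_sum=zeros, count=0)


def update(kv, mem_old):
    """One recurrent step, Mem_new = f((k_i, v_i), Mem_old).

    Only the current pair and the previous memory are used as inputs.
    """
    k, v = kv
    return Mem(
        key_sum=tuple(a + b for a, b in zip(mem_old.key_sum, k)),
        value_sum=tuple(a + b for a, b in zip(mem_old.value_sum, v)),
        count=mem_old.count + 1,
    )


def read_average(mem):
    """Return the cumulative average of keys and of values seen so far."""
    if mem.count == 0:
        raise ValueError("memory is empty")
    avg_k = tuple(s / mem.count for s in mem.key_sum)
    avg_v = tuple(s / mem.count for s in mem.value_sum)
    return avg_k, avg_v


# Stream the pairs one at a time; the memory never grows with the step count.
stream = [
    ((1.0, 2.0), (10.0, 20.0)),
    ((3.0, 4.0), (30.0, 40.0)),
    ((5.0, 6.0), (50.0, 60.0)),
]
mem = init_mem(2)
for kv in stream:
    mem = update(kv, mem)
```

By contrast, a procedure that recomputes the memory at step i by re-reading every stored pair from steps 1 to i takes the full history as input, so it cannot be written as f((k_i, v_i), Mem_old) with a fixed-size Mem.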
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Neural Network as a Memory Component
Segment-Level Recurrence for Memory Models
Calculating a Recurrent Memory State
Consider a memory update process defined by the recurrent function Mem_new = f((k_i, v_i), Mem_old), where (k_i, v_i) is the input at the current step and Mem_old is the memory state from the previous step. To compute the memory state for step 100, this process requires direct access to the individual key-value pairs from all 99 preceding steps (i.e., from step 1 to 99).
Formula for Memory as a Cumulative Average of Keys and Values