Learn Before
A system uses a moving average-based cache that summarizes the four most recent key-value pairs into a single summary pair. The current summary is calculated from pairs at time steps t-4, t-3, t-2, and t-1. When a new key-value pair arrives at time step t, how is the cache updated?
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Calculating a Compressed Memory State
A system's memory component compresses the four most recent key-value pairs into a single summary pair by independently averaging their respective vectors. What is the most significant trade-off inherent in this compression technique?
A system uses a moving average-based cache that summarizes the four most recent key-value pairs into a single summary pair. The current summary is calculated from pairs at time steps
t-4,t-3,t-2, andt-1. When a new key-value pair arrives at time stept, how is the cache updated?