Learn Before
KV Cache State During Generation
Based on the provided scenario, describe two things: 1) What is the exact content of the key cache after it is updated for this step? 2) Which specific key vectors will the new query vector q_3 attend to for its calculation?
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An autoregressive model is generating a sequence of outputs one step at a time. At step 't', the model has already processed all inputs from step 1 to 't-1' and stored their corresponding key-value pairs. To calculate the output for the current step 't', a new query vector (q_t) is generated. Which set of key vectors must this new query vector attend to in order to correctly incorporate all available context?
An autoregressive model is generating the next token in a sequence and has already processed the first 'N' tokens, with their corresponding key-value pairs stored in a cache. For the generation of the '(N+1)th' token, arrange the following actions in the correct chronological order.
KV Cache State During Generation
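
The questions above all turn on the same decoding step, so a small sketch may help make it concrete. It assumes single-head scaled dot-product attention and the common convention that the current step's key and value are appended to the cache before attention is computed. The function name decode_step and all vector values are hypothetical, chosen only for illustration; the scenario the first question refers to is not reproduced here.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def decode_step(q_t, k_t, v_t, key_cache, value_cache):
    """One autoregressive attention step with a KV cache (illustrative only)."""
    # 1) Append the current step's key and value to the cache.
    key_cache.append(k_t)
    value_cache.append(v_t)
    # 2) Score q_t against every cached key k_1..k_t, current step included.
    scale = math.sqrt(len(q_t))
    weights = softmax([dot(q_t, k) / scale for k in key_cache])
    # 3) The output is the attention-weighted sum of the cached values.
    out = [
        sum(w * v[i] for w, v in zip(weights, value_cache))
        for i in range(len(v_t))
    ]
    return out, weights


# Hypothetical cache holding keys and values for steps 1 and 2.
key_cache = [[1.0, 0.0], [0.0, 1.0]]
value_cache = [[1.0, 2.0], [3.0, 4.0]]

# Step 3: new query, key, and value vectors (made-up numbers).
out, weights = decode_step([1.0, 1.0], [1.0, 1.0], [5.0, 6.0], key_cache, value_cache)

print(len(key_cache))  # the cache now holds one key per step so far
print(len(weights))    # one attention weight per cached key
```

Under these assumptions, the key cache after step 3 contains the keys for steps 1, 2, and 3, and the step-3 query is scored against all three. Earlier keys and values are never recomputed; only the new pair is added at each step.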