Learn Before
An autoregressive model is generating the next token in a sequence and has already processed the first 'N' tokens, with their corresponding key-value pairs stored in a cache. For the generation of the '(N+1)th' token, arrange the following actions in the correct chronological order.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Comprehension in Revised Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An autoregressive model is generating a sequence of outputs one step at a time. At step 't', the model has already processed all inputs from step 1 to 't-1' and stored their corresponding key-value pairs. To calculate the output for the current step 't', a new query vector (q_t) is generated. Which set of key vectors must this new query vector attend to in order to correctly incorporate all available context?
KV Cache State During Generation
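
The cached generation step described in the prompts above can be sketched in a few lines. This is a minimal illustration, not the course's reference answer: it assumes a single attention head, plain Python lists instead of tensors, and hypothetical names (decode_step, cache, Wq, Wk, Wv) that do not come from the source. The ordering shown (project the new token, append its key and value to the cache, then attend over every cached key) is one common arrangement; real implementations may fuse or reorder some of these steps.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    # W is a list of rows; returns W @ x as a list.
    return [dot(row, x) for row in W]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def decode_step(x_new, cache, Wq, Wk, Wv):
    """One cached attention step for the newest position (hypothetical helper)."""
    # 1. Project only the new token's hidden vector into query, key, value.
    q = matvec(Wq, x_new)
    k = matvec(Wk, x_new)
    v = matvec(Wv, x_new)
    # 2. Append the new key/value pair; earlier pairs are reused, not recomputed.
    cache["keys"].append(k)
    cache["values"].append(v)
    # 3. Score the new query against every cached key, including its own.
    scale = math.sqrt(len(q))
    weights = softmax([dot(q, key) / scale for key in cache["keys"]])
    # 4. The output is the attention-weighted sum of all cached values.
    out = [sum(w * val[i] for w, val in zip(weights, cache["values"]))
           for i in range(len(v))]
    return out, weights

# Toy setup (assumed values): N = 3 tokens already cached, dimension 2.
identity = [[1.0, 0.0], [0.0, 1.0]]
cache = {
    "keys": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "values": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
}
out, weights = decode_step([1.0, 0.0], cache, identity, identity, identity)
print(len(cache["keys"]))  # 4: the cache grew from N to N + 1 entries
```

After the call, the cache holds N + 1 key-value pairs and the attention weights cover all of them, which is the state the next generation step starts from.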