Learn Before
Consider a language model generating a sequence of text one token at a time after being given an initial prompt. For the generation of the tenth token in the output sequence, the newly created query vector will attend to a set of key and value vectors derived only from the nine previously generated tokens.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An autoregressive language model is generating a sequence one token at a time. It has already processed the initial input 'The cat sat on the' and has subsequently generated the tokens 'mat and'. The model is now in the process of generating the token that will follow 'and'. What set of key and value vectors will the new query vector for this step attend to?
Dynamic K/V Cache in Transformer Decoding
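
The decoding scenarios described in the items above can be checked with a small bookkeeping sketch. It does not compute any real attention; it only tracks which tokens have key and value vectors in the cache at each generation step, assuming standard causal self-attention with a K/V cache that is filled from the prompt and extended by one entry per generated token. The function name `attended_tokens_per_step`, the word-level tokenization, and the continuation token 'purred' are hypothetical and chosen only for illustration.

```python
def attended_tokens_per_step(prompt_tokens, generated_tokens):
    """For each generation step, list the tokens whose cached key/value
    vectors the new query attends to.

    Assumption: standard causal attention with a K/V cache. The tokens in
    `generated_tokens` stand in for whatever a real model would sample.
    """
    # Prefill: every prompt token contributes one key/value pair to the cache.
    cache = list(prompt_tokens)

    attended = []
    for token in generated_tokens:
        # The query at this step is derived from the most recent token,
        # and it attends to everything currently in the cache
        # (prompt tokens plus all tokens generated so far).
        attended.append(list(cache))
        # Once the token is produced, its key/value pair joins the cache
        # and becomes visible to all later steps.
        cache.append(token)
    return attended


# Word-level tokens are an assumption; real tokenizers split text differently.
prompt = ["The", "cat", "sat", "on", "the"]
generated = ["mat", "and", "purred"]  # 'purred' is a made-up continuation

steps = attended_tokens_per_step(prompt, generated)

# Step index 2 is the token that follows 'and'.
print(steps[2])
print(len(steps[2]))
```

Under these assumptions, the step that follows 'and' sees seven cached entries: the five prompt tokens plus 'mat' and 'and'. More generally, with a prompt of P tokens, the step that produces the t-th output token sees P + (t - 1) cached entries in this sketch, so the cache size depends on the prompt length as well as on the number of tokens generated so far.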