Case Study

State of an Autoregressive Cache

An autoregressive language model generates text one token at a time. To produce the next token in a sequence, it must attend to all the tokens that came before it. To avoid recomputing the attention inputs for those earlier tokens at every step, the model maintains a 'cache' of the key and value vectors for every token it has already processed.
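The mechanism above can be sketched in a few lines of Python. This is a toy illustration, not any real model's implementation: the embedding and the `W_K`/`W_V` projections are stand-ins for learned weights, and `D_MODEL` is an arbitrary toy dimension.

```python
# Minimal sketch of a key-value (KV) cache during autoregressive decoding.
# All names, dimensions, and "projections" here are toy assumptions.
import random

D_MODEL = 4  # toy hidden size (assumption, not a real model's dimension)

def embed(token):
    # Deterministic toy "embedding": seed an RNG with the token text.
    rng = random.Random(token)
    return [rng.uniform(-1, 1) for _ in range(D_MODEL)]

def project(vec, seed):
    # Toy stand-in for a learned linear projection (W_K or W_V).
    rng = random.Random(seed)
    return [x * rng.uniform(-1, 1) for x in vec]

kv_cache = []  # one (token, key, value) entry per token, in order

for token in ["The", "quick", "brown"]:
    h = embed(token)
    k = project(h, "W_K")  # key vector for this token
    v = project(h, "W_V")  # value vector for this token
    kv_cache.append((token, k, v))

# After generating 'The quick brown', the cache holds one key vector and
# one value vector per prior token; the next token attends over all of them
# without re-running the earlier positions through the model.
print([t for t, _, _ in kv_cache])  # -> ['The', 'quick', 'brown']
print(len(kv_cache))                # -> 3
```

The point of the cache is the last comment: at each new step only the newest token's key and value are computed and appended, while all earlier entries are simply reused.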

Given this mechanism, if the model has just finished generating the sequence 'The quick brown', what information does this cache now hold in preparation for generating the next token?


Updated 2025-10-03

Tags: Ch.2 Generative Models - Foundations of Large Language Models; Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences; Application in Bloom's Taxonomy; Cognitive Psychology; Psychology; Social Science; Empirical Science; Science