Multiple Choice

An autoregressive language model is generating a sequence of tokens one by one. As the length of the generated sequence increases from 10 tokens to 100 tokens, what is the primary impact of the evolving key-value cache on the computation required to generate the next token?
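To see why the answer hinges on linear growth, here is a minimal NumPy sketch of single-head attention with a key-value cache; the dimension `d`, the random vectors, and the FLOP tally are illustrative assumptions, not the book's code. Each new token attends over every cached key, so per-token attention cost scales with the current cache length rather than requiring recomputation over all past tokens.

```python
import numpy as np

d = 8  # illustrative head dimension
rng = np.random.default_rng(0)

def attend(q, K, V):
    # Attention of one new query over all cached keys/values:
    # cost is O(len(cache) * d) per generated token.
    scores = K @ q / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

k_cache, v_cache = [], []
flops_per_step = []
for step in range(100):
    q = rng.standard_normal(d)
    # New token's key and value are appended to the cache; old entries
    # are reused, never recomputed.
    k_cache.append(rng.standard_normal(d))
    v_cache.append(rng.standard_normal(d))
    K, V = np.stack(k_cache), np.stack(v_cache)
    _ = attend(q, K, V)
    # Rough count: one d-dim dot product against every cached key.
    flops_per_step.append(2 * K.shape[0] * d)

# Per-token attention cost grows linearly with cache length:
print(flops_per_step[9], flops_per_step[99])  # → 160 1600
```

Going from 10 cached tokens to 100 makes the next-token attention roughly 10x more expensive, i.e. the cost per step grows linearly with the cached sequence length.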

Updated 2025-10-07

Tags

Ch.5 Inference - Foundations of Large Language Models

Computing Sciences

Analysis in Bloom's Taxonomy
