Short Answer

Computational Bottlenecks in Autoregressive Generation

When generating text one token at a time, a large language model must repeatedly attend to all previously generated tokens. Explain two distinct computational challenges that arise specifically from the self-attention mechanism in this iterative process, particularly as the generated sequence grows longer.
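
Hint: it can help to see the mechanism concretely before answering. Below is a minimal, illustrative NumPy sketch of single-head decoding with a key/value cache; the function names, toy head dimension, and random vectors are assumptions for illustration, not anything specified in the chapter. It makes both of the usual bottlenecks visible: the per-step attention computation touches every token generated so far, and the cached keys and values occupy memory that grows with every step.

    import numpy as np

    def attention(q, K, V):
        # Scaled dot-product attention: one new query against all cached keys/values.
        scores = q @ K.T / np.sqrt(K.shape[-1])  # O(n) dot products at step n
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over all previous tokens
        return weights @ V                       # weighted sum of n value vectors

    d = 64                                       # toy head dimension (illustrative)
    rng = np.random.default_rng(0)
    K_cache = np.empty((0, d))                   # keys: one row per generated token
    V_cache = np.empty((0, d))                   # values: one row per generated token

    for step in range(1, 6):
        k, v, q = rng.normal(size=(3, d))        # stand-ins for the new token's k, v, q
        K_cache = np.vstack([K_cache, k])
        V_cache = np.vstack([V_cache, v])
        out = attention(q, K_cache, V_cache)     # work grows linearly with step
        print(f"step {step}: attends over {len(K_cache)} tokens, "
              f"cache holds {K_cache.size + V_cache.size} floats")

Summing the linearly growing per-step cost over an n-token generation gives a quadratic total, and the printed cache size shows the memory footprint growing with every decoded token.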

Updated 2025-10-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Analysis in Bloom's Taxonomy
