Computational Bottlenecks in Autoregressive Generation
When generating text one token at a time, a large language model must repeatedly attend to all previously generated tokens. Explain two distinct computational challenges that arise specifically from the self-attention mechanism during this iterative process, particularly as the generated sequence grows longer.
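Although the card leaves the answer to the reader, a minimal sketch can make the two costs concrete. The NumPy toy below uses made-up sizes (head dimension, step count) that are not taken from the card; it is an illustration of the scaling behavior, not a reference implementation.

```python
import numpy as np

d = 64      # assumed head dimension (toy value)
steps = 8   # assumed number of decode steps (toy value)

k_cache = np.zeros((0, d))   # keys for all past tokens
v_cache = np.zeros((0, d))   # values for all past tokens

for t in range(1, steps + 1):
    q = np.random.randn(1, d)  # query for the newest token only
    k_cache = np.vstack([k_cache, np.random.randn(1, d)])
    v_cache = np.vstack([v_cache, np.random.randn(1, d)])

    # Challenge 1 (compute): the new token's query is scored against
    # every cached key, so step t does O(t * d) work and generating a
    # full sequence of n tokens costs O(n^2 * d) overall.
    scores = q @ k_cache.T / np.sqrt(d)       # shape (1, t)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    out = weights @ v_cache                   # shape (1, d)

    # Challenge 2 (memory): the key/value cache grows linearly with t
    # (multiplied by layers and heads in a real model), so memory use
    # scales with sequence length.
    print(f"step {t}: attended over {k_cache.shape[0]} positions, "
          f"cache holds {k_cache.size + v_cache.size} floats, "
          f"output shape {out.shape}")
```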
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Diagnosing Performance Bottlenecks in Autoregressive Generation
A team is deploying a large language model to generate chapter-length summaries of scientific papers. They observe that the time required to generate a summary grows dramatically with the length of the input paper, and the process often fails with out-of-memory errors on their hardware, even when processing one paper at a time. Which component of the model's architecture is the most direct cause of this scaling behavior?
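A back-of-envelope estimate helps explain the out-of-memory failures in this scenario. The sketch below assumes a hypothetical 7B-class model shape and fp16 storage; none of these numbers come from the question itself.

```python
# Assumed model dimensions (hypothetical 7B-class shape, fp16 cache).
n_layers, n_heads, d_head = 32, 32, 128
bytes_per_val = 2  # fp16

def kv_cache_bytes(seq_len: int) -> int:
    # Two tensors (K and V) per layer, each seq_len x n_heads x d_head.
    return 2 * n_layers * seq_len * n_heads * d_head * bytes_per_val

for seq_len in (2_000, 16_000, 128_000):
    print(f"{seq_len:>7} tokens -> {kv_cache_bytes(seq_len) / 2**30:.1f} GiB")
```

Under these assumptions the cache costs roughly 0.5 MiB per token, so a paper-length context of 128k tokens needs tens of GiB for the KV cache alone, before counting the quadratic attention compute.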