1Cademy - Analyzing Computational Scaling

Learn Before

Computational Cost of Self-Attention in Transformers

Short Answer

Analyzing Computational Scaling

A core mechanism in many modern text-processing models involves calculating a score for every pair of words in an input sequence to determine their relationship. Explain why the computational requirement of this mechanism grows quadratically (i.e., as the square of the sequence length) rather than linearly as the input sequence gets longer.

Updated 2025-10-06

Contributors are:

Who are from:

Learn Before

Related