Learn Before
The prefilling phase of a large language model is considered a compute-bound process because self-attention is computed in parallel across the entire input sequence, saturating the processing unit's arithmetic throughput; by contrast, the token-by-token decoding phase is memory-bound, dominated by transferring model weights and the KV cache from memory.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A machine learning team observes that the initial processing of a user's entire input sequence is the slowest part of their language model's inference pipeline. This step involves a single, large computational pass where attention is calculated for all input tokens simultaneously. To reduce this latency, they can only afford one of the following hardware upgrades. Which upgrade would most effectively speed up this specific initial processing step?
Performance Bottleneck Analysis in LLM Inference
The prefilling phase of a large language model is considered a compute-bound process because self-attention is computed in parallel across the entire input sequence, saturating the processing unit's arithmetic throughput; by contrast, the token-by-token decoding phase is memory-bound, dominated by transferring model weights and the KV cache from memory.
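The compute-bound character of prefill can be seen with a back-of-the-envelope roofline estimate. The sketch below compares the arithmetic intensity (FLOPs per byte of weight traffic) of processing a whole prompt at once versus generating one token at a time; the model dimensions and accelerator specs are illustrative assumptions, not measurements of any real system.

```python
# Roofline-style estimate: prefill vs. decode arithmetic intensity.
# All model sizes and hardware numbers below are hypothetical, for illustration.

def arithmetic_intensity(batch: int, seq_len: int, d_model: int, n_layers: int) -> float:
    """FLOPs per byte of weight traffic when processing `seq_len` tokens in one pass."""
    params = n_layers * 12 * d_model ** 2   # rough parameter count of the transformer blocks
    flops = 2 * params * batch * seq_len    # ~2 FLOPs per parameter per token
    bytes_moved = 2 * params                # fp16 weights read from memory once per pass
    return flops / bytes_moved

# Hypothetical 7B-scale model: 32 layers, d_model = 4096.
prefill_ai = arithmetic_intensity(batch=1, seq_len=2048, d_model=4096, n_layers=32)
decode_ai  = arithmetic_intensity(batch=1, seq_len=1,    d_model=4096, n_layers=32)

# Hypothetical accelerator: 300 TFLOP/s compute, 2 TB/s memory bandwidth.
machine_balance = 300e12 / 2e12  # 150 FLOPs/byte: above this, compute is the bottleneck

print(f"prefill: {prefill_ai:.0f} FLOPs/byte -> "
      f"{'compute' if prefill_ai > machine_balance else 'memory'}-bound")
print(f"decode:  {decode_ai:.0f} FLOPs/byte -> "
      f"{'compute' if decode_ai > machine_balance else 'memory'}-bound")
```

Because prefill amortizes each weight read over thousands of tokens, its intensity lands far above the machine's balance point (compute-bound), while decode reuses each loaded weight for only a single token and sits well below it (memory-bound). This is why faster compute helps prefill latency, whereas decoding benefits most from higher memory bandwidth.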