1Cademy - Analyzing Prefilling Phase Inefficiency

Learn Before

Diagram of the Prefilling Phase

Short Answer

Analyzing Prefilling Phase Inefficiency

An engineer observes that during the initial processing of an input sequence (the prefilling phase), the time taken to generate all the necessary key and value vectors increases linearly with the number of tokens in the sequence. Based on the typical data flow for this phase, identify the core inefficiency that this observation points to, and describe the correct, more efficient method for generating these vectors.
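As a starting point for the answer, the contrast the question hints at can be sketched as follows. This is a minimal illustration, not taken from the source: it assumes NumPy, a single attention head, and hypothetical names (`kv_sequential`, `kv_batched`, `W_k`, `W_v`) and sizes chosen only for the example. The loop version projects one token at a time, which is the pattern that would produce time growing step by step with sequence length. The batched version projects the whole prompt with one matrix multiplication per weight matrix, which parallel hardware can process largely at once.

```python
import numpy as np

def kv_sequential(X, W_k, W_v):
    """Token-by-token projection: one small vector-matrix product per token."""
    keys = np.stack([x @ W_k for x in X])
    values = np.stack([x @ W_v for x in X])
    return keys, values

def kv_batched(X, W_k, W_v):
    """Whole-prompt projection: one matrix-matrix product per weight matrix."""
    return X @ W_k, X @ W_v

# Illustrative sizes only (assumptions, not values from the source).
rng = np.random.default_rng(0)
n_tokens, d_model, d_head = 8, 16, 4
X = rng.standard_normal((n_tokens, d_model))    # one row per input token
W_k = rng.standard_normal((d_model, d_head))    # hypothetical key projection
W_v = rng.standard_normal((d_model, d_head))    # hypothetical value projection

K_seq, V_seq = kv_sequential(X, W_k, W_v)
K_bat, V_bat = kv_batched(X, W_k, W_v)

# Both approaches yield the same key and value vectors.
print(np.allclose(K_seq, K_bat), np.allclose(V_seq, V_bat))
```

Both functions perform the same arithmetic in total. The difference is that the batched form exposes all tokens to the hardware at once, since every token of the prompt is already known during prefilling.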

Updated 2025-10-10
