Comparison of Processing in Chunked vs. Standard Prefilling
Standard prefilling processes an entire input sequence in a single forward pass to construct the Key-Value (KV) cache all at once. In contrast, chunked prefilling operates sequentially on smaller segments of the input, requiring a distinct forward pass for each chunk to compute its attention outputs and progressively update the KV cache.
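As a rough illustration of this difference, the NumPy sketch below builds the KV cache for a toy single-head attention layer both ways: once in a single pass over the whole prompt, and once chunk by chunk, with each pass attending over everything cached so far. The shapes, chunk size, and random projection weights are illustrative assumptions rather than any particular framework's API; batching, multiple heads and layers, and positional encodings are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, seq_len, chunk_size = 16, 12, 4  # toy sizes, chosen only for illustration

# Hypothetical fixed projection weights for a single attention head.
W_q = rng.standard_normal((d_model, d_model))
W_k = rng.standard_normal((d_model, d_model))
W_v = rng.standard_normal((d_model, d_model))

hidden = rng.standard_normal((seq_len, d_model))  # prompt token hidden states


def causal_attention(q, k_cache, v_cache, offset):
    """Attention of queries at absolute positions offset..offset+len(q)-1
    over everything currently in the KV cache, with a causal mask."""
    scores = (q @ k_cache.T) / np.sqrt(d_model)
    q_pos = offset + np.arange(q.shape[0])[:, None]
    k_pos = np.arange(k_cache.shape[0])[None, :]
    scores = np.where(k_pos <= q_pos, scores, -np.inf)  # no attending to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v_cache


def prefill_standard(h):
    """Standard prefilling: one forward pass over the whole prompt builds the
    entire KV cache and computes all attention outputs at once."""
    k_cache, v_cache = h @ W_k, h @ W_v
    out = causal_attention(h @ W_q, k_cache, v_cache, offset=0)
    return k_cache, v_cache, out


def prefill_chunked(h, chunk):
    """Chunked prefilling: sequential forward passes over smaller segments,
    each extending the KV cache and attending over everything cached so far."""
    k_cache = np.empty((0, d_model))
    v_cache = np.empty((0, d_model))
    outputs = []
    for start in range(0, h.shape[0], chunk):
        piece = h[start:start + chunk]
        k_cache = np.vstack([k_cache, piece @ W_k])  # progressively update K cache
        v_cache = np.vstack([v_cache, piece @ W_v])  # progressively update V cache
        outputs.append(causal_attention(piece @ W_q, k_cache, v_cache, offset=start))
    return k_cache, v_cache, np.vstack(outputs)


k_std, v_std, out_std = prefill_standard(hidden)
k_chk, v_chk, out_chk = prefill_chunked(hidden, chunk_size)

# Both strategies populate an identical KV cache and produce the same attention
# outputs; they differ only in how many forward passes it takes to get there.
assert np.allclose(k_std, k_chk) and np.allclose(v_std, v_chk) and np.allclose(out_std, out_chk)
print("KV caches and attention outputs match:", k_std.shape)
```

The closing assertion underlines the comparison above: the end state of the KV cache is the same either way; what changes is the number of forward passes and the peak amount of work done in each one.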
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Improved Throughput and Reduced Latency with Chunked Prefilling
Comparison of Processing in Chunked vs. Standard Prefilling
Balancing Throughput and Latency via Chunk Size in Chunked Prefilling
Increased Scheduling Complexity in Chunked Prefilling
Example of Chunked Prefilling in Iteration-Level Scheduling
An LLM inference server handles a mix of long document summarization requests and short, interactive chat queries. Operators observe that chat queries experience high latency whenever a long document's initial processing pass is running. To mitigate this, they implement a system that breaks the initial input of long documents into smaller segments, processing each segment in a separate forward pass to incrementally build the necessary cache. Which statement best evaluates the primary trade-off of this change?
Optimizing Inference Scheduling
An LLM inference system is using a method to process a long input sequence that has been divided into several segments or 'chunks'. Arrange the following steps in the correct chronological order to describe how the system incrementally builds the Key-Value (KV) cache for the entire input before starting to generate a response.
Comparison of Processing in Chunked vs. Standard Prefilling
A large language model is tasked with processing a very long input document. To prepare for generating a response, it computes the Key-Value cache for the entire document in a single, large forward pass before any new tokens are produced. What is the most significant computational challenge or trade-off inherent to this 'all-at-once' approach?
A user submits a prompt to a large language model that uses a conventional inference process. Arrange the following stages in the correct chronological order, from receiving the prompt to generating the first new word.
Inference Bottleneck on Memory-Constrained Devices
Learn After
Increased Memory Overhead in Chunked Prefilling
Reduced Prefilling Parallelism in Chunked Prefilling
A large language model is processing a long input sequence to populate its Key-Value (KV) cache before starting token generation. Which statement best analyzes the fundamental difference between processing the entire sequence in a single forward pass versus processing it in sequential segments?
Analysis of KV Cache Population
Forward Pass Calculation for KV Cache Population