Learn Before
Iteration in Continuous Batching
In continuous batching, an iteration represents a distinct step in computation, corresponding to either the full prefilling phase for a given input or a single token's decoding step. For instance, given an input sequence $\mathbf{x}$ and a target output sequence $\mathbf{y}$, processing requires a total of $|\mathbf{y}| + 1$ iterations: one initial iteration to handle prefilling, followed by $|\mathbf{y}|$ iterations to generate the output sequence, yielding one token per iteration.
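This counting rule is easy to state in code. Below is a minimal sketch, assuming one iteration covers the entire prefill regardless of prompt length; the function name `iterations_for_request` is illustrative rather than taken from any inference framework.

```python
def iterations_for_request(num_output_tokens: int) -> int:
    """Total iterations needed to serve one request under continuous
    batching: one prefilling iteration covering the entire input prompt
    (however long it is), plus one decoding iteration per output token."""
    return 1 + num_output_tokens


# A request generating 3 output tokens takes 4 iterations in total:
# iteration 1 prefills the whole prompt, iterations 2-4 each decode one token.
assert iterations_for_request(num_output_tokens=3) == 4
```

Note that the prompt length does not appear in the count: prefilling processes all input tokens within a single iteration, so the total scales with the output length only.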
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Iteration in Continuous Batching
General Process of Continuous Batching
Example of Interleaving Prefilling and Decoding in Continuous Batching
Overhead of Dynamic Batch Reorganization in Continuous Batching
Memory Fragmentation in LLM Inference
Prefilling-Prioritized Strategy in Continuous Batching
Simple Iteration-level Scheduling
Priority-Based Scheduling in LLM Inference
Custom Priority Policies in LLM Scheduling
Disaggregation of Prefilling and Decoding using Pipelined Engines
Comparison of Continuous (Prefilling-Prioritized) vs. Standard (Decoding-Prioritized) Batching
LLM Inference Scheduling Strategy
An LLM inference server is processing a batch of three long-running requests. In the middle of this process, after several computational steps have already been completed for the initial batch, a new, short request arrives. How would a system implementing continuous batching most likely handle this new request in the next computational step?
An LLM inference system is designed to maximize hardware utilization. Which of the following operational descriptions best illustrates the core principle of continuous batching, distinguishing it from a static batching approach?
Learn After
A large language model using a continuous batching inference system processes a single request. The input prompt consists of 150 tokens, and the model is configured to generate an output of 200 tokens. How many computational iterations are required to fully process this single request?
LLM Inference Request Processing
In a continuous batching system for large language model inference, every single token processed, whether from the input prompt or the generated output, constitutes one separate computational iteration.