Analyzing a System State in Continuous Batching
An LLM inference engine using a continuous batching strategy is processing two separate requests simultaneously. After a recent computational step, the state is as follows:
- Request 1 has generated the sequence: "The cat sat on"
- Request 2 has generated the sequence: "Once upon a"
In the immediately preceding step, the generated sequences were "The cat sat" and "Once upon", respectively. No new requests have arrived, and neither request has finished generating.
Based on this change in state, describe the single computational operation that was just performed and explain why this operation is a key feature of the iterative decoding phase for a batch.
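For reference, a minimal sketch of the operation being asked about, assuming a toy engine where each active request holds its token list. The name `batched_forward` is a hypothetical stand-in for the model's real forward pass, not an actual API:

```python
def decode_step(active_sequences, batched_forward):
    """Run a single iteration of batched decoding.

    One forward pass over the whole batch produces exactly one new
    token per unfinished request, which is appended to that
    request's sequence.
    """
    # Hypothetical batched forward pass: one token per active sequence.
    next_tokens = batched_forward(active_sequences)
    for seq, tok in zip(active_sequences, next_tokens):
        seq.append(tok)
    return active_sequences

# The state change in the question corresponds to one such call:
state = [["The", "cat", "sat"], ["Once", "upon"]]
fake_forward = lambda seqs: ["on", "a"]  # stand-in for the real model
decode_step(state, fake_forward)
# state is now [["The", "cat", "sat", "on"], ["Once", "upon", "a"]]
```

The point the sketch illustrates is that both requests advance by exactly one token in the same computational step, which is what makes the decoding phase iterative across the whole batch.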
Tags
Ch.5 Inference - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Related
- Example of Concurrent Prefilling and Decoding in Continuous Batching (Iteration 4): An inference engine is processing a batch of two text generation requests, Request A and Request B, using a continuous batching strategy. So far, the engine has generated the first output token for each: 'The' for Request A, and 'Once' for Request B. Neither request is complete, and no new requests have arrived. What is the most likely immediate next action the engine will perform in a single computational step?
- A continuous batching system receives two new text generation requests simultaneously. Arrange the following computational stages in the correct chronological order for processing these two requests, assuming no other requests arrive during this time.