Example of the First Decoding Step in Continuous Batching (Iteration 2)
This diagram illustrates the second iteration of a continuous batching process, which follows the initial prefilling of requests x1 and x2. In this step, the scheduler directs the inference engine to perform a single decoding operation over the entire batch, concurrently generating the first output token for both request x1 and request x2. This shows how the system transitions from the prefilling phase to the iterative decoding phase for a group of requests.
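The batched decoding step described above can be sketched in a few lines of Python. This is only an illustrative model of the scheduling logic: the `Request` class, the `decode_step` helper, and the toy `generate_next_token` stand-in are hypothetical names for this sketch, not the API of any real inference engine.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    rid: str
    prompt_tokens: list                 # tokens already processed during prefill
    output_tokens: list = field(default_factory=list)

def decode_step(batch, generate_next_token):
    """One batched decoding iteration: every request in the batch
    produces exactly one new output token."""
    for req in batch:
        # The model conditions on the full context seen so far.
        context = req.prompt_tokens + req.output_tokens
        req.output_tokens.append(generate_next_token(context))
    return batch

# Two requests whose prefill phase has already completed (iteration 1).
x1 = Request("x1", prompt_tokens=[101, 102, 103])
x2 = Request("x2", prompt_tokens=[201, 202])

# Iteration 2: the first decode step. A toy "model" stands in for the
# real forward pass; here the next token is just the context length.
decode_step([x1, x2], generate_next_token=len)

print(x1.output_tokens, x2.output_tokens)  # each request gained one token
```

In a real engine the loop body would be a single batched forward pass over all active sequences rather than a per-request call, which is what makes continuous batching efficient.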

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of the First Decoding Step in Continuous Batching (Iteration 2)
An inference server's scheduler receives two new, independent user requests at the same time. Assuming the system has the capacity to handle both, what is the most accurate description of the scheduler's immediate action and the primary goal of this initial processing step?
Initial Batch Formation
An inference scheduler receives two new, independent requests. Arrange the following events to accurately describe the initial processing step for these requests.
Learn After
Example of the Second Decoding Step in Continuous Batching (Iteration 3)
An inference engine is processing user requests. It has just completed the initial processing for two separate requests, 'Request A' and 'Request B', loading them into memory. Both requests are now ready for the next stage of generation. What is the most likely immediate next action the engine will take to operate efficiently, and what will be its result?
State of Batched Requests After First Generation Step
Inference Engine State Analysis