1Cademy - Example of Reusing a Completed Slot in Continuous Batching (Iteration 6)

Learn Before

Example of a Request Completing in Continuous Batching (Iteration 5)

Example

This example shows the sixth iteration of a continuous batching process, immediately after request x₂ has completed. Two new requests, x₄ and x₅, arrive in the system. The scheduler adjusts the batch dynamically, assigning the resources freed by x₂ to the new request x₄. In a single computational step, the system then runs the prefill phase for x₄ while executing one decoding step each for the ongoing requests x₁ and x₃. This illustrates a key efficiency of continuous batching: freed resources are reused immediately, so the processing of new and existing requests is interleaved and throughput increases.
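The scheduling decision in this iteration can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular serving system. The `Request` class, the `schedule_iteration` function, and the batch capacity of three slots are all assumptions made for the example. The text does not say what happens to x₅, so the sketch assumes it stays in the waiting queue until another slot frees up.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record used only for this sketch."""
    name: str
    prefilled: bool = False  # True once the prompt has been processed
    finished: bool = False   # True once generation has ended


def schedule_iteration(running, waiting, max_slots):
    """Plan one iteration: free finished slots, admit new requests, label work."""
    # 1. Drop completed requests so their slots become available.
    active = [r for r in running if not r.finished]

    # 2. Fill free slots from the waiting queue in arrival order.
    while waiting and len(active) < max_slots:
        active.append(waiting.popleft())

    # 3. Within the same step, new requests are prefilled and
    #    ongoing requests each take one decoding step.
    plan = {}
    for r in active:
        if r.prefilled:
            plan[r.name] = "decode"
        else:
            plan[r.name] = "prefill"
            r.prefilled = True
    return active, plan


# State entering iteration 6: x2 has just completed, x4 and x5 have arrived.
x1 = Request("x1", prefilled=True)
x2 = Request("x2", prefilled=True, finished=True)
x3 = Request("x3", prefilled=True)
running = [x1, x2, x3]
waiting = deque([Request("x4"), Request("x5")])

# max_slots=3 is an assumed capacity, chosen so that exactly one slot is free.
running, plan = schedule_iteration(running, waiting, max_slots=3)
print(plan)                       # work assigned to each request this step
print([r.name for r in waiting])  # requests still waiting for a slot
```

With these assumed inputs, x₄ takes over the slot released by x₂ and is prefilled, while x₁ and x₃ each advance by one decoding step in the same iteration.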

Updated 2025-10-09

References

Reference of Foundations of Large Language Models Course
