Example

Example of the Second Decoding Step in Continuous Batching (Iteration 3)

This diagram shows the third iteration of the continuous batching example, which follows the first decoding step. In this iteration, the scheduler again directs the inference engine to run a single decoding step on the batch containing requests x1 and x2. The step generates the second output token for each of the two requests, illustrating the iterative, token-by-token nature of the decoding phase.
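The iteration described above can be sketched as a small simulation. This is a minimal illustration, not an actual inference engine: the names `Request`, `prefill`, `decode_step`, and `run` are hypothetical, the generated tokens are placeholder strings, and it assumes that iteration 1 was the prefill of both requests and iteration 2 was the first decoding step.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Request:
    """Hypothetical request record; not taken from any real library."""
    name: str
    prompt: List[str]
    output: List[str] = field(default_factory=list)
    prefilled: bool = False


def prefill(batch: List[Request]) -> None:
    # Stand-in for processing the prompts (e.g. building a KV cache).
    for req in batch:
        req.prefilled = True


def decode_step(batch: List[Request]) -> None:
    # Stand-in for one forward pass: exactly one new token per request.
    for req in batch:
        assert req.prefilled, "decoding requires a prefilled request"
        req.output.append(f"{req.name}_y{len(req.output) + 1}")


def run(batch: List[Request], iterations: int) -> List[Tuple[int, str]]:
    # Assumption: iteration 1 is prefill, every later iteration is one decode step.
    log = []
    for it in range(1, iterations + 1):
        if it == 1:
            prefill(batch)
            log.append((it, "prefill"))
        else:
            decode_step(batch)
            log.append((it, "decode"))
    return log


x1 = Request("x1", ["a", "b"])
x2 = Request("x2", ["c"])
log = run([x1, x2], iterations=3)
print(log[-1], x1.output, x2.output)
```

Under these assumptions, after iteration 3 each of x1 and x2 holds two output tokens, one from each decoding step. The scheduler makes its decision once per iteration rather than once per request, which is what allows the batch composition to change between steps in a continuous batching setup.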

Image 0

Updated 2025-10-09

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences