Activity (Process)

Scheduler-Driven Batch Adjustments Between Iterations in Continuous Batching

In continuous batching, the inference engine processes requests iteratively rather than running a fixed batch to completion. After each iteration completes, the scheduler re-evaluates the active batch and may change its composition, for example by admitting newly arrived requests or removing requests that have finished generating. Because these adjustments happen between iterations, the engine can adapt to a changing workload without waiting for the whole batch to drain, which is fundamental to the efficiency of the process.
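The post-iteration adjustment can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not any particular engine's implementation: the `Request` class, `adjust_batch`, `run_engine`, the `arrivals` mapping, and the `max_batch_size` limit are all hypothetical names introduced here, and each iteration is assumed to advance every active request by exactly one decode step.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request that needs `remaining` more decode steps."""
    name: str
    remaining: int


def adjust_batch(active, waiting, max_batch_size):
    """Scheduler step run between iterations (illustrative only).

    Drops finished requests, then admits waiting requests in FIFO order
    until the assumed batch-size limit is reached.
    """
    active = [r for r in active if r.remaining > 0]
    while waiting and len(active) < max_batch_size:
        active.append(waiting.popleft())
    return active


def run_engine(arrivals, max_batch_size):
    """Simulate the engine loop and record the batch used in each iteration.

    `arrivals` maps an iteration index to the requests that become
    available just before that iteration starts.
    """
    waiting = deque(arrivals.get(0, []))
    active = adjust_batch([], waiting, max_batch_size)  # form the first batch
    history = []
    step = 0
    last_arrival = max(arrivals, default=0)
    while active or waiting or step < last_arrival:
        # One iteration: every active request advances by one decode step.
        # (An empty batch here models an idle iteration before later arrivals.)
        history.append([r.name for r in active])
        for r in active:
            r.remaining -= 1
        step += 1
        # Between iterations: collect new arrivals, then let the scheduler
        # adjust the composition of the active batch.
        waiting.extend(arrivals.get(step, []))
        active = adjust_batch(active, waiting, max_batch_size)
    return history


arrivals = {
    0: [Request("A", 3), Request("B", 1)],
    1: [Request("C", 2)],
}
history = run_engine(arrivals, max_batch_size=2)
print(history)  # [['A', 'B'], ['A', 'C'], ['A', 'C']]
```

In this run, request B finishes after the first iteration, so the scheduler fills the freed slot with C before the second iteration instead of waiting for A to complete. A real scheduler would also weigh memory limits and scheduling policy, which this sketch leaves out.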


Updated 2026-05-06


Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences