Concept

Simple Iteration-level Scheduling

Simple iteration-level scheduling is a strategy used in continuous batching systems, where scheduling decisions are made at every discrete computational step, or iteration. In any given iteration, the scheduler assigns exactly one task to each sequence in the active batch, such as one decoding step or one chunk of a prefill operation. This enables fine-grained interleaving of different computational tasks: for example, a new request's prefill can be processed in the same iteration as the decoding steps of ongoing requests.
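The per-iteration assignment described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the implementation of any particular serving system: the names `Sequence`, `schedule_iteration`, `run`, and the `chunk_size` parameter are hypothetical, and no model is actually executed; the code only records which task each sequence would receive in each iteration.

```python
from dataclasses import dataclass


@dataclass
class Sequence:
    """Hypothetical bookkeeping for one request (illustrative only)."""
    seq_id: str
    prompt_len: int        # prompt tokens that must be prefilled
    max_new_tokens: int    # decoding steps to run once prefill is complete
    prefilled: int = 0     # prompt tokens processed so far
    generated: int = 0     # tokens decoded so far

    @property
    def done(self) -> bool:
        return (self.prefilled >= self.prompt_len
                and self.generated >= self.max_new_tokens)


def schedule_iteration(batch, chunk_size):
    """Assign exactly one task to each unfinished sequence for this iteration."""
    tasks = []
    for seq in batch:
        if seq.done:
            continue
        if seq.prefilled < seq.prompt_len:
            # Prefill is still pending: process one chunk of the prompt.
            n = min(chunk_size, seq.prompt_len - seq.prefilled)
            seq.prefilled += n
            tasks.append((seq.seq_id, "prefill", n))
        else:
            # Prefill is complete: run a single decoding step.
            seq.generated += 1
            tasks.append((seq.seq_id, "decode", 1))
    return tasks


def run(batch, chunk_size=4):
    """Run iterations until every sequence is finished; return the task trace."""
    trace = []
    while any(not s.done for s in batch):
        trace.append(schedule_iteration(batch, chunk_size))
    return trace


if __name__ == "__main__":
    # "A" is an ongoing request that is already decoding; "B" has just arrived.
    batch = [
        Sequence("A", prompt_len=4, max_new_tokens=3, prefilled=4),
        Sequence("B", prompt_len=6, max_new_tokens=2),
    ]
    for i, tasks in enumerate(run(batch, chunk_size=4)):
        print(i, tasks)
```

In this sketch the first two iterations pair a decoding step for the ongoing request "A" with a prefill chunk for the newly arrived request "B", which is the interleaving the definition refers to. Once "B" finishes its prefill, both sequences receive decoding steps until each reaches its token limit.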

Updated 2025-10-07

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences