Multiple Choice

An inference server is processing user prompts using a dynamic, iterative batching system. At a certain point in time, the server observes that the queue of new, incoming user prompts is empty. However, within the batch currently being processed, one or more text sequences have not yet reached their natural stopping point. Based on these two observations, what is the immediate next step for the batching system?
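One way to reason about this scenario is to sketch the scheduling decision made at each decoding iteration. The code below is a minimal illustration, not any particular server's implementation. The names `Sequence`, `decode_step`, `next_action`, and `run` are hypothetical, the model forward pass is replaced by a stand-in, and a sequence's "natural stopping point" is approximated by a token budget.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Sequence:
    """Hypothetical in-flight request; stops after max_new_tokens tokens."""
    prompt: str
    max_new_tokens: int
    tokens: list = field(default_factory=list)

    @property
    def finished(self) -> bool:
        return len(self.tokens) >= self.max_new_tokens


def decode_step(seq: Sequence) -> None:
    # Stand-in for one model forward pass: append a placeholder token.
    seq.tokens.append(f"tok{len(seq.tokens)}")


def next_action(queue: deque, batch: list) -> str:
    """Decide what the scheduler does at the start of an iteration."""
    active = [s for s in batch if not s.finished]
    if queue:
        return "admit"      # new prompts are waiting: try to add them
    if active:
        return "continue"   # queue empty, but some sequences are unfinished
    return "idle"           # nothing queued and nothing left to decode


def run(queue: deque, batch: list, max_batch_size: int = 4) -> list:
    """Run iterations until idle; return the action taken at each one."""
    log = []
    while True:
        # Finished sequences leave the batch between iterations.
        batch = [s for s in batch if not s.finished]
        action = next_action(queue, batch)
        log.append(action)
        if action == "idle":
            return log
        if action == "admit":
            while queue and len(batch) < max_batch_size:
                batch.append(queue.popleft())
        # One more decoding iteration for every sequence still in the batch.
        for s in batch:
            decode_step(s)
```

Under these assumptions, calling `next_action(deque(), [Sequence("a", 3, ["t0"])])` matches the situation in the question: the queue is empty, one sequence is unfinished, and the sketch runs another decoding iteration on the current batch instead of waiting or stopping.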

Updated 2025-10-01

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science