Learn Before
An inference engine employing a continuous batching strategy is initialized and presented with a queue of 10 pending user requests. In forming the very first batch to begin processing, which of the following is the most critical constraint determining how many of these requests can be grouped together?
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Initial Batch Creation in Continuous Batching
Batching Sequences of Varying Lengths
Assembling an Initial Processing Batch
When an inference engine using continuous batching forms its initial batch, it is required to include all user requests that are currently pending in the queue, regardless of system limitations.
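
To make the idea of assembling an initial batch concrete, here is a minimal sketch, not taken from any particular inference engine. It assumes the limiting resource can be modeled as a single token budget (a stand-in for available accelerator memory) plus an optional cap on batch size. The names Request, form_initial_batch, token_budget, and max_batch_size, and all of the numbers, are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    request_id: int
    prompt_tokens: int    # tokens already present in the prompt
    max_new_tokens: int   # worst-case number of tokens to be generated


def form_initial_batch(queue, token_budget, max_batch_size):
    """Admit requests in FIFO order until a limit is reached.

    Assumption: memory use is approximated by reserving
    prompt_tokens + max_new_tokens for every admitted request.
    """
    batch = []
    used = 0
    while queue and len(batch) < max_batch_size:
        needed = queue[0].prompt_tokens + queue[0].max_new_tokens
        if used + needed > token_budget:
            break  # the next request does not fit; it stays in the queue
        batch.append(queue.popleft())
        used += needed
    return batch, used


# Ten pending requests with growing prompt lengths (made-up values).
queue = deque(
    Request(i, prompt_tokens=100 + 50 * i, max_new_tokens=100)
    for i in range(10)
)
batch, used = form_initial_batch(queue, token_budget=2000, max_batch_size=8)
print(len(batch), used, len(queue))
```

With these made-up numbers, six of the ten requests are admitted (1950 of the 2000 budgeted tokens) and four remain queued. Under continuous batching, those waiting requests would be candidates to join later, as running sequences finish and release their share of the budget.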