Multiple Choice

An inference system is processing a batch of requests using a dynamic scheduling method. At a specific moment, one request (Request A) completes its generation. The system still has two ongoing requests (Request B and Request C) that require further processing. At the same time, a new request (Request D) arrives. Given this state, which of the following actions by the system's scheduler represents the most efficient use of computational resources in the very next step?
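The scenario can be traced with a small simulation. This is a minimal sketch, assuming that "dynamic scheduling" here means iteration-level (continuous) batching, in which a slot freed by a finished request is refilled from the waiting queue before the next step. The names `Request`, `remaining_tokens`, `step`, and `max_batch_size` are hypothetical, and the sketch does not model the difference between prefill and decoding for the newly admitted request.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    remaining_tokens: int  # decoding steps still needed (hypothetical field)


def step(running, waiting, max_batch_size):
    """Run one scheduling iteration and return the names processed in it."""
    # Evict finished requests so their batch slots become free.
    running[:] = [r for r in running if r.remaining_tokens > 0]
    # Refill free slots from the waiting queue; ongoing requests stay in place.
    while waiting and len(running) < max_batch_size:
        running.append(waiting.popleft())
    # One iteration: every active request advances by one token.
    for r in running:
        r.remaining_tokens -= 1
    return [r.name for r in running]


# State from the question: A has just finished, B and C are ongoing, D arrives.
running = [Request("A", 0), Request("B", 3), Request("C", 2)]
waiting = deque([Request("D", 4)])

next_batch = step(running, waiting, max_batch_size=3)
print(next_batch)  # ['B', 'C', 'D']
```

Under these assumptions, the very next step processes B, C, and D together, so the slot vacated by A does not sit idle while B and C finish.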

Updated 2025-09-26

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science