Learn Before
Multiple Choice

A large neural network model is partitioned across four sequential processing stages, with each stage running on a separate hardware device. During training, a full batch of data is processed entirely by the first device, and its output is then passed to the second device. The second device processes this output and passes its result to the third, and so on. While one device is actively computing, the other three devices are idle, waiting for their turn. What is the primary inefficiency this specific computational strategy introduces?
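The idle time the question describes can be made concrete with a small back-of-the-envelope calculation. The sketch below (an illustration, not part of the question) assumes every stage takes one time unit: with a single full batch, only one of the four devices is ever busy, while splitting the batch into micro-batches (GPipe-style scheduling) shrinks the idle "bubble".

```python
def naive_pipeline_utilization(num_stages: int) -> float:
    """One full batch flows through the stages strictly one after another:
    total wall time is num_stages * t, but each device computes for only t,
    so per-device utilization is 1 / num_stages."""
    busy_time_per_device = 1.0      # each device computes for t = 1
    total_time = num_stages * 1.0   # stages run sequentially, never overlapping
    return busy_time_per_device / total_time

def micro_batch_utilization(num_stages: int, num_micro_batches: int) -> float:
    """Splitting the batch into m micro-batches lets stages overlap:
    the schedule takes (m + num_stages - 1) steps, of which each device
    is busy for m, so utilization = m / (m + num_stages - 1)."""
    m = num_micro_batches
    return m / (m + num_stages - 1)

print(naive_pipeline_utilization(4))    # 0.25 -> 3 of 4 devices idle at any moment
print(micro_batch_utilization(4, 16))   # ~0.84 with 16 micro-batches
```

With the naive schedule in the question, utilization is stuck at 1/4 regardless of batch size; the micro-batch figure shows why pipelined schedules are the standard remedy.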


Updated 2025-10-01


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science
