A large neural network decoder, consisting of 12 sequential processing blocks, is distributed across 12 separate workers, with each worker assigned exactly one block. For a single input, the computation proceeds sequentially through the workers from 1 to 12 during the forward pass, and then in reverse from 12 to 1 during the backward pass. What is the primary factor limiting the overall computational efficiency of this specific arrangement?
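The limiting factor can be made concrete with a toy timeline simulation. The sketch below (a minimal illustration, assuming each block's forward and backward computation takes one unit of time, and ignoring communication cost) shows that with a single input, only one of the 12 workers is ever active at a time, so average utilization is 1/12:

```python
# Minimal sketch: naive layer-wise (pipeline) parallelism on ONE input.
# Assumption: forward and backward through each block take one timestep.

NUM_WORKERS = 12

# Forward pass: worker t is busy at timestep t (0..11).
forward = [(t, t) for t in range(NUM_WORKERS)]          # (timestep, worker)

# Backward pass: workers run in reverse at timesteps 12..23.
backward = [(NUM_WORKERS + t, NUM_WORKERS - 1 - t) for t in range(NUM_WORKERS)]

total_time = 2 * NUM_WORKERS                            # 24 timesteps end to end
busy_slots = len(forward) + len(backward)               # 24 busy worker-timesteps
capacity = NUM_WORKERS * total_time                     # 288 possible worker-timesteps

utilization = busy_slots / capacity                     # 1/12
print(f"Average worker utilization: {utilization:.1%}")
```

Since exactly one worker computes at any timestep while the other 11 sit idle waiting for activations (forward) or gradients (backward), the dominant inefficiency is this pipeline idle time, often called the pipeline "bubble".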
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Symbolic Representation of Layer-wise Parallelism
A 3-block neural network decoder is distributed across 3 workers using layer-wise parallelism, with each worker responsible for one block (Worker 1 has Block 1, Worker 2 has Block 2, and Worker 3 has Block 3). For a single training iteration, arrange the following computational events in the correct chronological order.
GPU Utilization in a Distributed System