Learn Before
Backward Pass Latency in Sequential Model Parallelism
A deep neural network is trained using a setup where consecutive layers are distributed across different workers. An engineer observes that during the backward pass, the worker holding the initial layers of the model is the last one to complete its computations for any given data batch. Based on the data flow of this process, explain why this observation is expected.
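The ordering the question probes can be sketched with a tiny timeline simulation. This is a hypothetical helper (the function name and worker count are illustrative, not from the card): in sequential model parallelism, activations flow from the first worker to the last during the forward pass, and gradients flow back from the last worker to the first during the backward pass, so the worker holding the initial layers computes first on the way forward and last on the way back.

```python
def training_step_order(num_workers):
    """Return the order in which workers compute during one
    forward + backward pass under layer-wise (sequential) model
    parallelism. Workers are numbered 1..num_workers, with
    Worker 1 holding the earliest layers."""
    forward = list(range(1, num_workers + 1))    # activations flow 1 -> N
    backward = list(range(num_workers, 0, -1))   # gradients flow N -> 1
    return forward + backward

# Worker 1 appears first in the forward phase and last in the
# backward phase, matching the engineer's observation.
print(training_step_order(4))  # [1, 2, 3, 4, 4, 3, 2, 1]
```

The key point the sketch makes concrete: Worker 1's backward step depends on gradients that must first propagate through every later worker, so it cannot finish before any of them.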
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An 8-layer neural network is distributed across 4 workers, with each worker holding 2 consecutive layers (Worker 1 has layers 1-2, Worker 2 has layers 3-4, etc.). During the forward pass for a single data batch, what is the state of Worker 1 and Worker 4 at the exact moment Worker 3 is actively computing its layers (layers 5-6)?
A 4-layer neural network is distributed across two workers using layer-wise model parallelism (Worker 1 holds layers 1-2, Worker 2 holds layers 3-4). Arrange the following events in the correct chronological order for a single training step, which includes one forward and one backward pass.
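For the two-worker case above, the dependency chain of one training step can be written out explicitly. This is a hypothetical event list (the event labels are illustrative, not taken from the card's answer choices), showing why the forward pass runs Worker 1 then Worker 2, while the backward pass reverses that order:

```python
# One training step on 2 workers under layer-wise model parallelism.
# Each event can only start once the previous one has produced
# the activations (forward) or gradients (backward) it needs.
events = [
    "Worker 1 forward pass (layers 1-2)",
    "Worker 2 forward pass (layers 3-4)",
    "Loss computed from the model output",
    "Worker 2 backward pass (layers 3-4)",
    "Worker 1 backward pass (layers 1-2)",
]
for step, event in enumerate(events, start=1):
    print(step, event)
```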