Learn Before
Diagnosing Training Inefficiency
A machine learning team is training a large model partitioned across four accelerators, where each accelerator holds a different sequential segment of the model. They notice that their monitoring tools show a 'bubble' of inactivity that propagates through the accelerators; only one device is active at any given time during a forward or backward pass, leading to poor overall hardware utilization. What specific type of parallelism is designed to solve this exact problem, and how does it achieve better hardware utilization?
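The scenario above describes the "pipeline bubble" of naive model parallelism, which pipeline parallelism reduces by splitting each batch into micro-batches so stages work concurrently. A minimal sketch of the utilization arithmetic, assuming a GPipe-style schedule (the function name and slot model are illustrative, not from the question):

```python
def pipeline_utilization(num_stages: int, num_microbatches: int) -> float:
    """Fraction of time each stage does useful work in a GPipe-style
    pipeline schedule.

    The schedule takes (num_microbatches + num_stages - 1) time slots to
    drain; each stage is busy for num_microbatches of those slots, and the
    remaining (num_stages - 1) slots are the "bubble" of idle time.
    """
    total_slots = num_microbatches + num_stages - 1
    return num_microbatches / total_slots

# Naive model parallelism is the 1-micro-batch case: with 4 stages,
# each device is busy only 1/4 of the time -> utilization 0.25.
naive = pipeline_utilization(num_stages=4, num_microbatches=1)

# Splitting the batch into 8 micro-batches shrinks the bubble:
# utilization rises to 8 / (8 + 3) ~ 0.73.
pipelined = pipeline_utilization(num_stages=4, num_microbatches=8)
```

Driving the micro-batch count well above the stage count pushes utilization toward 1, which is exactly how pipeline parallelism fills the bubble the monitoring tools revealed.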
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Data Parallelism
Model Parallelism
Pipeline Parallelism
A research team is developing a novel language model with several trillion parameters. During the initial training setup, they discover that the model is too large to fit into the memory of a single available accelerator (e.g., a GPU). Which parallelism strategy is specifically designed to address this fundamental constraint?
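The memory constraint in this question can be made concrete with a back-of-the-envelope calculation. This sketch assumes fp16 weights (2 bytes per parameter) and counts only the weights themselves, ignoring optimizer state and activations, which add several times more in practice; the function name is illustrative:

```python
def weights_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory in GiB needed just to hold the model weights.

    Assumes fp16/bf16 storage by default (2 bytes per parameter);
    optimizer state and activations are not included.
    """
    return num_params * bytes_per_param / 2**30

# A 1-trillion-parameter model in fp16 needs roughly
# 1e12 * 2 bytes ~ 1863 GiB for the weights alone -- far beyond any
# single accelerator's memory, so the weights must be partitioned
# across devices, which is what model parallelism does.
one_trillion_fp16 = weights_memory_gib(1e12)
```

Even before gradients and optimizer state are counted, the weights alone exceed single-device memory by more than an order of magnitude, which is why model parallelism (partitioning the model itself across accelerators) is the strategy this question points to.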
Match each parallelism strategy with the description that best defines its core mechanism for distributing the training workload.