Symbolic Representation of Layer-wise Parallelism
In layer-wise parallelism, the computation for a given block, denoted as Bᵢ, is represented by the block symbol together with an arrow indicating the direction of the pass. The forward pass is symbolized by an upward arrow (↑), and the backward pass by a downward arrow (↓). For instance, the forward and backward computations for Block 1 are written as B₁ ↑ and B₁ ↓, while for Block 2 they are written as B₂ ↑ and B₂ ↓.
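This notation can be sketched programmatically. The following is a minimal illustrative example (the function names are invented for this sketch, not from any library): it generates the chronological B₁ ↑ … Bₙ ↑, Bₙ ↓ … B₁ ↓ schedule for a single input, and estimates per-worker utilization under the assumption that each block takes one time step per pass.

```python
def layerwise_schedule(num_blocks):
    """Chronological operations for one iteration with a single input:
    forward passes B1↑ .. Bn↑, then backward passes Bn↓ .. B1↓."""
    forward = [f"B{i}\u2191" for i in range(1, num_blocks + 1)]
    backward = [f"B{i}\u2193" for i in range(num_blocks, 0, -1)]
    return forward + backward

def worker_utilization(num_blocks):
    """With one block per worker and a single input, each worker is
    busy for only 2 of the 2n time steps (one forward, one backward),
    so utilization is 1/n."""
    return 2 / (2 * num_blocks)

print(layerwise_schedule(3))   # ['B1↑', 'B2↑', 'B3↑', 'B3↓', 'B2↓', 'B1↓']
print(worker_utilization(12))  # each of 12 workers is busy 1/12 of the time
```

The utilization figure makes the efficiency problem of this arrangement concrete: at any moment only one worker is active while the others sit idle, which is why naive layer-wise execution of a single input is so wasteful.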
Tags: Ch.2 Generative Models - Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences
Related
- Symbolic Representation of Layer-wise Parallelism: A large neural network decoder, consisting of 12 sequential processing blocks, is distributed across 12 separate workers, with each worker assigned exactly one block. For a single input, computation proceeds sequentially through the workers from 1 to 12 during the forward pass, and then in reverse from 12 to 1 during the backward pass. What is the primary factor limiting the overall computational efficiency of this arrangement?
- A 3-block neural network decoder is distributed across 3 workers using layer-wise parallelism, with each worker responsible for one block (Worker 1 has Block 1, Worker 2 has Block 2, and Worker 3 has Block 3). For a single training iteration, arrange the given computational events in the correct chronological order.
- GPU Utilization in a Distributed System

Learn After
- A computational process involving two sequential blocks, Block 1 (B₁) and Block 2 (B₂), is represented by the sequence of operations B₁ ↑, B₂ ↑, B₂ ↓, B₁ ↓. What does this sequence describe?
- A neural network model is divided into three computational blocks (Block 1, Block 2, and Block 3) for layer-wise parallelism. Arrange the symbolic representations of the computations for a single, complete forward and backward pass in the correct chronological order.
- Analyzing a Parallel Computation Sequence