Example

Computational Graph of a Multi-Device Two-Layer MLP

When a simple two-layer multilayer perceptron (MLP) is trained across multiple devices, such as a CPU and two GPUs, its operations form a computational graph with strict dependencies between computation and communication. For example, a gradient must be fully computed on a GPU before it can be transferred to the CPU. Deciding by hand which of these intertwined operations run in parallel and which run sequentially is tedious and error-prone, so it is advantageous to rely on a graph-based computing backend that can optimize the schedule automatically.
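To make the dependency structure concrete, here is a minimal sketch of how a scheduler could derive an execution order from such a graph. It uses only Python's standard-library `graphlib` and does not reflect how any particular deep learning backend is implemented. The operation names (`gpu0_forward`, `copy_grad2_gpu0_to_cpu`, and so on) and the helper `schedule_waves` are hypothetical, and the graph assumes a data-parallel setup in which each GPU computes gradients that the CPU then aggregates.

```python
from graphlib import TopologicalSorter

# Hypothetical operations for one training step of a two-layer MLP.
# Each key maps to the set of operations that must finish before it starts.
deps = {
    "gpu0_forward": {"gpu0_load_batch"},
    "gpu1_forward": {"gpu1_load_batch"},
    "gpu0_backward_layer2": {"gpu0_forward"},
    "gpu1_backward_layer2": {"gpu1_forward"},
    "gpu0_backward_layer1": {"gpu0_backward_layer2"},
    "gpu1_backward_layer1": {"gpu1_backward_layer2"},
    # A gradient can be copied only after it has been computed.
    "copy_grad2_gpu0_to_cpu": {"gpu0_backward_layer2"},
    "copy_grad2_gpu1_to_cpu": {"gpu1_backward_layer2"},
    "copy_grad1_gpu0_to_cpu": {"gpu0_backward_layer1"},
    "copy_grad1_gpu1_to_cpu": {"gpu1_backward_layer1"},
    # The CPU updates a layer once both GPUs have delivered its gradient.
    "cpu_update_layer2": {"copy_grad2_gpu0_to_cpu", "copy_grad2_gpu1_to_cpu"},
    "cpu_update_layer1": {"copy_grad1_gpu0_to_cpu", "copy_grad1_gpu1_to_cpu"},
}


def schedule_waves(deps):
    """Group operations into waves; all operations in a wave are mutually
    independent and could in principle run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # operations whose inputs are done
        waves.append(ready)
        sorter.done(*ready)  # mark them finished to unlock their successors
    return waves


waves = schedule_waves(deps)
for i, wave in enumerate(waves):
    print(f"wave {i}: {wave}")
```

In this sketch the copies of the layer-2 gradients land in the same wave as the layer-1 backward pass, which illustrates how a graph-based backend can overlap communication with computation without the programmer scheduling it explicitly.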

Updated 2026-05-18

Tags

D2L

Dive into Deep Learning @ D2L