Concept

NVLink Connectivity in Multi-GPU Servers

Modern deep learning hardware often features bespoke network connectivity to handle large data transfers efficiently. For example, in an 8-GPU server, each GPU typically connects to a host CPU via a PCIe link operating at around 16 GB/s. In addition, each GPU may have multiple NVLink connections to other GPUs, each capable of transferring 300 Gbit/s bidirectionally (roughly 18 GB/s per direction). A single NVLink connection is therefore only slightly faster than PCIe; the advantage comes from having several of them. Because the aggregate NVLink bandwidth significantly exceeds the PCIe bandwidth, maximizing training efficiency requires specialized synchronization protocols that exploit this hardware architecture.
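The bandwidth comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are made up for this example, and the figure of 6 NVLink connections per GPU is an assumption (the text only says "multiple"), so the aggregate number should be read as a rough estimate rather than a hardware specification.

```python
# Figures taken from the text above.
PCIE_GBYTE_PER_S = 16            # host <-> GPU PCIe link, approximate
NVLINK_BIDIR_GBIT_PER_S = 300    # one NVLink connection, both directions combined

# ASSUMPTION: the text only says "multiple" links; 6 is a hypothetical value.
LINKS_PER_GPU = 6


def nvlink_per_direction_gbyte(bidir_gbit_per_s):
    """Convert a bidirectional Gbit/s figure to GB/s in one direction."""
    one_direction_gbit = bidir_gbit_per_s / 2   # split the two directions
    return one_direction_gbit / 8               # 8 bits per byte


per_link = nvlink_per_direction_gbyte(NVLINK_BIDIR_GBIT_PER_S)
aggregate = per_link * LINKS_PER_GPU
ratio = aggregate / PCIE_GBYTE_PER_S

print(f"NVLink per link, per direction: {per_link} GB/s")
print(f"Aggregate over {LINKS_PER_GPU} links: {aggregate} GB/s")
print(f"Aggregate NVLink / PCIe: {ratio:.2f}x")
```

Under these assumptions a single link works out to 18.75 GB/s per direction, which matches the "roughly 18 GB/s" quoted above, and the aggregate over all links is several times the PCIe figure. That gap is why synchronization schemes that keep gradient traffic on NVLink, rather than routing it through the host, are attractive.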

Updated 2026-05-18

Tags

D2L

Dive into Deep Learning @ D2L