Concept
Hardware Data Movement Bottlenecks
Deep learning performance depends heavily on how quickly data moves from durable storage and RAM to the processors (CPUs or GPUs). If data cannot be loaded fast enough, or if matrices cannot be transferred to the accelerators fast enough, the processing elements starve and data movement becomes the system bottleneck. To achieve good performance, systems must move data efficiently and often interleave communication with computation, so that the next batch is already in flight while the current one is being processed.
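One way to picture interleaving communication with computation is a prefetching loader: a background thread fetches the next batches while the main thread works on the current one. The sketch below is a minimal illustration using only the Python standard library, not any framework's actual data pipeline. The names `load_batches`, `prefetch`, `compute`, and `run` are hypothetical, and `time.sleep` stands in for both storage latency and accelerator work.

```python
import queue
import threading
import time


def load_batches(num_batches, load_time):
    """Hypothetical loader: sleep simulates a slow read from storage."""
    for i in range(num_batches):
        time.sleep(load_time)
        yield [i] * 4


def prefetch(batches, depth=2):
    """Yield items from `batches`, loading up to `depth` ahead in a thread."""
    buf = queue.Queue(maxsize=depth)
    sentinel = object()

    def producer():
        try:
            for batch in batches:
                buf.put(batch)  # blocks when the buffer is full
        finally:
            buf.put(sentinel)  # always signal the end so the consumer stops

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is sentinel:
            return
        yield item


def compute(batch, compute_time):
    """Hypothetical compute step: sleep simulates work on the processor."""
    time.sleep(compute_time)
    return sum(batch)


def run(batches, compute_time):
    """Consume all batches and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [compute(b, compute_time) for b in batches]
    return results, time.perf_counter() - start


# Serial: each load finishes before its compute starts.
serial_results, serial_s = run(load_batches(8, 0.02), 0.02)
# Overlapped: loading the next batch proceeds while the current one computes.
overlap_results, overlap_s = run(prefetch(load_batches(8, 0.02)), 0.02)

print(f"serial:     {serial_s:.3f} s")
print(f"overlapped: {overlap_s:.3f} s")
```

With equal load and compute times, the overlapped run should take roughly half as long as the serial one, because the processor no longer waits for each load. Exact timings depend on the machine and scheduler. The bounded queue (`depth`) keeps the loader from running arbitrarily far ahead and exhausting memory.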
Updated 2026-05-18
Tags
D2L
Dive into Deep Learning @ D2L