Minibatch Size Selection Trade-off

Although increasing the minibatch size $|\mathcal{B}_t|$ reduces the variance of gradient estimates, the benefit has diminishing returns. For independently sampled examples, the standard deviation of the minibatch gradient shrinks only in proportion to $1/\sqrt{|\mathcal{B}_t|}$, while the computational cost per iteration grows linearly in $|\mathcal{B}_t|$. Beyond a certain point, the extra reduction in standard deviation is therefore small relative to the added cost. In practice, the minibatch size is chosen large enough to give good computational efficiency and stable gradient estimates while still fitting within the memory of the hardware, such as a GPU.
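
A small simulation can make the diminishing returns concrete. The sketch below is illustrative only: it stands in for real per-example gradients with synthetic i.i.d. standard normal noise (an assumption, not something taken from the source), and the function name `gradient_std` is hypothetical. Under that assumption, the spread of the minibatch average should track $1/\sqrt{b}$ for batch size $b$.

```python
import numpy as np


def gradient_std(batch_size, num_trials=2000, seed=0):
    """Empirical std of a minibatch-averaged 'gradient'.

    Assumption: per-example gradients are modeled as i.i.d. N(0, 1) scalars,
    which is a toy stand-in for real gradients.
    """
    rng = np.random.default_rng(seed)
    # Each row is one simulated minibatch of per-example gradients.
    per_example = rng.standard_normal((num_trials, batch_size))
    # Average within each minibatch, then measure the spread across minibatches.
    return per_example.mean(axis=1).std()


if __name__ == "__main__":
    for b in (1, 4, 16, 64, 256, 1024):
        print(f"batch={b:5d}  empirical std={gradient_std(b):.4f}  "
              f"1/sqrt(b)={b ** -0.5:.4f}")
```

In this toy setting, each fourfold increase in batch size roughly halves the standard deviation while quadrupling the number of examples processed per step, which is the trade-off described above.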

Updated 2026-05-15

Tags

Data Science

D2L

Dive into Deep Learning @ D2L
