Activity (Process)

Deep Learning Minibatch Training Loop

The main training loop for a deep learning model iteratively optimizes the parameters $\mathbf{w}$ and $b$. During each epoch, the loop makes one pass through the entire training dataset. In each iteration within an epoch, a minibatch $\mathcal{B}$ is processed. The model computes the loss for this minibatch, averages it over the examples in the batch, and calculates the gradient of the averaged loss with respect to the parameters: $\mathbf{g} \leftarrow \partial_{(\mathbf{w},b)} \frac{1}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} l(\mathbf{x}^{(i)}, y^{(i)}, \mathbf{w}, b)$. Because the loss is already averaged over the minibatch, the optimization algorithm does not need to divide the gradient by the batch size again. Finally, the optimization algorithm updates the parameters using the rule $(\mathbf{w}, b) \leftarrow (\mathbf{w}, b) - \eta \mathbf{g}$, where $\eta$ is the learning rate. This cycle repeats until training is complete.
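The loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes a linear model with squared loss $l = \frac{1}{2}(\mathbf{w}^\top \mathbf{x} + b - y)^2$, and the function names (`minibatches`, `grad_averaged_loss`, `train`), the synthetic data, and the hyperparameter values are all hypothetical choices made for the example.

```python
import numpy as np

def minibatches(X, y, batch_size, rng):
    """Yield shuffled minibatches that together cover the dataset once (one epoch)."""
    indices = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = indices[start:start + batch_size]
        yield X[batch], y[batch]

def grad_averaged_loss(X, y, w, b):
    """Gradient of the minibatch-averaged squared loss (assumed loss, see lead-in)."""
    error = X @ w + b - y
    # Dividing by the batch size here is the 1/|B| averaging step,
    # so the update rule below does not divide again.
    grad_w = X.T @ error / len(y)
    grad_b = error.mean()
    return grad_w, grad_b

def train(X, y, lr=0.1, num_epochs=20, batch_size=10, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for epoch in range(num_epochs):                      # one pass over the data per epoch
        for X_batch, y_batch in minibatches(X, y, batch_size, rng):
            grad_w, grad_b = grad_averaged_loss(X_batch, y_batch, w, b)
            w = w - lr * grad_w                          # (w, b) <- (w, b) - eta * g
            b = b - lr * grad_b
    return w, b

# Synthetic data for illustration only: y = X @ true_w + true_b + small noise.
data_rng = np.random.default_rng(42)
true_w, true_b = np.array([2.0, -3.4]), 4.2
X = data_rng.normal(size=(1000, 2))
y = X @ true_w + true_b + data_rng.normal(scale=0.01, size=1000)

w, b = train(X, y)
print("estimated w:", w, "estimated b:", b)
```

With this setup the learned `w` and `b` should land close to `true_w` and `true_b`. In a deep learning framework the gradient would normally come from automatic differentiation rather than a hand-written formula, but the epoch and minibatch structure stays the same.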

Updated 2026-05-24

Tags

D2L

Dive into Deep Learning @ D2L