Learn Before
Loss Gradient over a Mini-batch
The expression $\nabla_{\theta} L_{\mathcal{B}}(\theta)$ represents the gradient of the loss function, $L$, with respect to the model parameters, $\theta$. This gradient is computed on a specific mini-batch of training samples, $\mathcal{B}$, and indicates the direction of the steepest increase in the loss for that batch.
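To make this concrete, here is a minimal NumPy sketch of computing the loss gradient over one mini-batch and comparing it with the full-batch gradient. The linear model, squared-error loss, and batch size of 32 are illustrative assumptions, not part of the card:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: m samples from a hypothetical linear model y = X @ theta.
m, d = 1000, 5
X = rng.normal(size=(m, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.1 * rng.normal(size=m)

theta = np.zeros(d)  # current model parameters

def minibatch_gradient(theta, X_batch, y_batch):
    """Gradient of the mean squared-error loss over one mini-batch B:

    L(theta; B)     = (1/|B|) * sum_i (x_i @ theta - y_i)^2
    dL/dtheta       = (2/|B|) * X_B^T (X_B @ theta - y_B)
    """
    residual = X_batch @ theta - y_batch
    return (2.0 / len(y_batch)) * (X_batch.T @ residual)

# Sample a mini-batch B of 32 examples and compute its loss gradient.
idx = rng.choice(m, size=32, replace=False)
g_batch = minibatch_gradient(theta, X[idx], y[idx])

# Full-batch gradient, for comparison: the mini-batch gradient is a
# noisy estimate of this direction of steepest increase in the loss.
g_full = minibatch_gradient(theta, X, y)
print("mini-batch grad:", np.round(g_batch, 3))
print("full-batch grad:", np.round(g_full, 3))
```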

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
An Example of Mini-Batches
Mini-Batch Gradient Descent Algorithm
Batch vs Stochastic vs Mini-Batch Gradient Descent
Example Using Mini-Batch Gradient Descent (Learning Rate Decay)
Mini-Batches Size
Which of these statements about mini-batch gradient descent do you agree with?
Why is the best mini-batch size usually not 1 and not m, but instead something in-between?
Suppose your learning algorithm’s cost J, plotted as a function of the number of iterations, looks like the image below:
Stochastic Gradient Descent Algorithm
Learn After
Distributed Gradient Calculation
An engineer is training a model using mini-batches and notices that while the overall training loss is decreasing over many updates, the loss value for individual mini-batches fluctuates significantly, sometimes increasing from one batch to the next (a sketch after this list reproduces the effect). Which statement best analyzes the fundamental reason for this behavior based on the properties of the mini-batch loss gradient?
Analyzing Gradient Magnitude
Comparing Gradient Calculation Methods
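The engineer's observation in the question above follows from a general property: each mini-batch gradient (and loss) is a noisy estimate of the full-data quantity, computed on a different random subset each step, so the per-batch loss can rise even as the overall loss trends down. A minimal NumPy sketch that reproduces the effect; the linear model, learning rate, and batch size are illustrative assumptions, not from the course:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear-regression setup, as in the earlier sketch.
m, d = 1000, 5
X = rng.normal(size=(m, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + 0.5 * rng.normal(size=m)

theta = np.zeros(d)
lr, batch_size = 0.05, 32

def batch_loss(theta, Xb, yb):
    """Mean squared-error loss on a given set of samples."""
    return np.mean((Xb @ theta - yb) ** 2)

for step in range(1, 201):
    # Each update uses the loss gradient over a fresh random mini-batch.
    idx = rng.choice(m, size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    grad = (2.0 / batch_size) * (Xb.T @ (Xb @ theta - yb))
    theta -= lr * grad
    if step % 40 == 0:
        # The per-batch loss fluctuates from step to step, while the
        # full-data loss decreases steadily: each mini-batch is only a
        # noisy sample of the overall training objective.
        print(f"step {step:3d}  batch loss {batch_loss(theta, Xb, yb):.3f}  "
              f"full loss {batch_loss(theta, X, y):.3f}")
```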