1Cademy - Comparing Gradient Calculation Methods

Learn Before

Loss Gradient over a Mini-batch

Short Answer

Comparing Gradient Calculation Methods

Consider two scenarios for updating a model's parameters: one uses the gradient calculated from a single small subset of the training data (a mini-batch), and the other uses the gradient calculated from the entire training dataset. Explain the fundamental difference in the information these two gradients provide, and justify why, despite this difference, using the mini-batch gradient is a standard and effective practice when training large models.
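The two scenarios can be compared numerically. The following is a minimal sketch, not a reference answer: it assumes a hypothetical one-parameter linear model `y_hat = w * x`, a mean squared error loss, and a synthetic dataset, none of which come from the prompt itself. It computes the gradient over the whole dataset and over disjoint mini-batches of equal size.

```python
import random


def mse_gradient(w, xs, ys):
    """Gradient of the mean squared error of y_hat = w * x with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


random.seed(0)

# Hypothetical toy dataset (an assumption for illustration): y = 3x + noise.
xs = [random.uniform(-1.0, 1.0) for _ in range(1000)]
ys = [3.0 * x + random.gauss(0.0, 0.1) for x in xs]

w = 0.0          # current parameter value
batch_size = 50  # assumed mini-batch size

# Scenario 1: gradient over the entire training dataset (1000 examples).
full_grad = mse_gradient(w, xs, ys)

# Scenario 2: gradient over each disjoint mini-batch (50 examples each).
batch_grads = [
    mse_gradient(w, xs[i:i + batch_size], ys[i:i + batch_size])
    for i in range(0, len(xs), batch_size)
]

# Averaging the equal-sized mini-batch gradients recovers the full gradient.
mean_batch_grad = sum(batch_grads) / len(batch_grads)

print("full-dataset gradient:   ", full_grad)
print("first mini-batch gradient:", batch_grads[0])
print("mean of mini-batch grads: ", mean_batch_grad)
```

In this sketch each mini-batch gradient is computed from 50 examples instead of 1000, so it is cheaper but varies from batch to batch. Because the batches are disjoint and equal in size, their average matches the full-dataset gradient up to floating-point error, which is one way to see the mini-batch gradient as a noisy estimate of the full one.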

Updated 2025-10-08
