1Cademy - Common Learning Rate Decay Implementation

Learn Before

Learning Rate Decay
Epoch in Gradient Descent

Concept

Common Learning Rate Decay Implementation

$\alpha = \frac{1}{1 + decay\_rate \cdot epoch\_num}\,\alpha_0$, where $\alpha$ is the learning rate in the current epoch, $\alpha_0$ is the initial learning rate, $epoch\_num$ is the current epoch number, and $decay\_rate$ is the chosen decay rate, a tunable hyperparameter.

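Setting $decay\_rate = 1$ and $\alpha_0 = 0.2$, we graph an example with $epoch\_num$ on the x-axis and $\alpha$ on the y-axis. The graph shows the learning rate decaying as the epoch number grows: $\alpha = 0.1$ at $epoch\_num = 1$, about $0.067$ at $epoch\_num = 2$, and $0.05$ at $epoch\_num = 3$.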
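
As a minimal sketch, the formula above can be written as a small Python function. The function name `decayed_learning_rate` and the choice to start counting epochs at 1 are assumptions made for illustration; they are not part of the original concept.

```python
def decayed_learning_rate(alpha_0: float, decay_rate: float, epoch_num: int) -> float:
    """Return alpha = alpha_0 / (1 + decay_rate * epoch_num)."""
    return alpha_0 / (1.0 + decay_rate * epoch_num)


# Settings from the example: decay_rate = 1, alpha_0 = 0.2.
# Counting epochs from 1 is an assumption made for this sketch.
schedule = [decayed_learning_rate(0.2, 1.0, epoch) for epoch in range(1, 5)]

for epoch, alpha in enumerate(schedule, start=1):
    print(f"epoch {epoch}: alpha = {alpha:.4f}")
```

With these settings the learning rate shrinks quickly in the first few epochs and then flattens out, which matches the shape described for the graph. A smaller $decay\_rate$ would make the decay more gradual.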