Common Learning Rate Decay Implementation

$\alpha = \frac{1}{1 + decay\_rate \cdot epoch\_num}\,\alpha_0$, where $\alpha$ is the learning rate in the current epoch, $\alpha_0$ is the initial learning rate, $epoch\_num$ is the current epoch number, and $decay\_rate$ is the chosen decay rate. The decay rate is a tunable hyperparameter.

Setting $decay\_rate = 1$ and $\alpha_0 = 0.2$, we plot an example with $epoch\_num$ on the x-axis and $\alpha$ on the y-axis. The graph shows the learning rate decaying as the epoch number increases.

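[Figure: learning rate $\alpha$ (y-axis) versus $epoch\_num$ (x-axis) for $decay\_rate = 1$ and $\alpha_0 = 0.2$]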
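The same curve can be reproduced numerically. The following is a minimal Python sketch of the formula, not a reference implementation: the function name `decayed_learning_rate` is hypothetical, and starting the epoch count at 0 is an assumption, since the text does not say whether epochs are numbered from 0 or 1.

```python
def decayed_learning_rate(alpha_0, decay_rate, epoch_num):
    """Return alpha = alpha_0 / (1 + decay_rate * epoch_num)."""
    return alpha_0 / (1.0 + decay_rate * epoch_num)


# Values from the example above: decay_rate = 1 and alpha_0 = 0.2.
# Assumption: epochs are counted from 0, so epoch 0 uses alpha_0 unchanged.
schedule = [decayed_learning_rate(0.2, 1.0, epoch) for epoch in range(5)]

for epoch, alpha in enumerate(schedule):
    print(f"epoch {epoch}: alpha = {alpha:.4f}")
# epoch 0: alpha = 0.2000
# epoch 1: alpha = 0.1000
# epoch 2: alpha = 0.0667
# epoch 3: alpha = 0.0500
# epoch 4: alpha = 0.0400
```

With these settings the learning rate halves after the first epoch and then shrinks more slowly, which matches the flattening shape of the plotted curve. A larger decay rate makes the early drop steeper.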

Updated 2020-11-16

Tags

Data Science