Cosine Learning Rate Schedule
A cosine learning rate schedule, proposed by Loshchilov and Hutter (2016), adjusts the learning rate along half a period of a cosine curve. It relies on the observation that the learning rate should not decrease too drastically at the beginning of training, and that the solution should be refined at the end using a very small learning rate. For update steps $t \in [0, T]$, this results in a schedule with the functional form:
$$\eta_t = \eta_T + \frac{\eta_0 - \eta_T}{2} \left(1 + \cos\left(\frac{\pi t}{T}\right)\right)$$
Here, $\eta_0$ is the initial learning rate and $\eta_T$ is the target rate at the maximum update step $T$. For steps $t > T$, the learning rate is simply pinned to $\eta_T$ without increasing it again.
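The schedule can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `cosine_lr` and its parameter names are hypothetical, and the values 0.3, 0.01, and 20 in the usage lines are arbitrary choices.

```python
import math


def cosine_lr(t, eta_0, eta_T, T):
    """Learning rate at update step t under a cosine schedule.

    eta_0: initial learning rate (at t = 0)
    eta_T: target learning rate (reached at t = T)
    T:     maximum update step of the schedule
    """
    if t >= T:
        # Past the end of the schedule the rate stays at the target value.
        return eta_T
    # Half a cosine period: slow decay at first, faster in the middle,
    # slow again as the rate approaches eta_T.
    return eta_T + (eta_0 - eta_T) / 2 * (1 + math.cos(math.pi * t / T))


# Arbitrary illustrative settings: decay from 0.3 to 0.01 over 20 steps,
# then evaluate a few steps beyond T to show the rate staying pinned.
schedule = [cosine_lr(t, eta_0=0.3, eta_T=0.01, T=20) for t in range(25)]
```

The early entries of `schedule` stay close to the initial rate, the middle entries fall fastest, and every entry from step 20 onward equals the target rate.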
Tags
D2L
Dive into Deep Learning @ D2L