1Cademy - Cosine Learning Rate Scheduler Implementation

Learn Before

Cosine Learning Rate Schedule

Code

Cosine Learning Rate Scheduler Implementation

A cosine learning rate scheduler can be implemented from scratch as a custom Python class that maps the current step $t$ to a learning rate. It optionally includes a warmup phase: for $t < t_{\text{warmup}}$ the learning rate increases linearly from a starting value toward the base rate $\eta_0$. From the end of warmup up to and including the maximum update step, the learning rate follows a cosine curve, $\eta_t = \eta_T + \frac{\eta_0 - \eta_T}{2} \left(1 + \cos\left(\frac{\pi (t - t_{\text{warmup}})}{T_{\text{max\_steps}}}\right)\right)$, where $\eta_T$ is the final learning rate and $T_{\text{max\_steps}}$ is the maximum update step minus $t_{\text{warmup}}$. Beyond the maximum update step the rate is no longer updated and the scheduler returns the last value it computed. The following code demonstrates this implementation and plots the resulting schedule:

import math

import torch
from d2l import torch as d2l


class CosineScheduler:
    def __init__(self, max_update, base_lr=0.01, final_lr=0,
                 warmup_steps=0, warmup_begin_lr=0):
        self.base_lr_orig = base_lr
        # Last computed rate; returned unchanged once epoch exceeds max_update
        self.base_lr = base_lr
        self.max_update = max_update
        self.final_lr = final_lr
        self.warmup_steps = warmup_steps
        self.warmup_begin_lr = warmup_begin_lr
        # Number of steps over which the cosine decay runs
        self.max_steps = self.max_update - self.warmup_steps

    def get_warmup_lr(self, epoch):
        # Linear ramp from warmup_begin_lr toward base_lr_orig
        increase = (self.base_lr_orig - self.warmup_begin_lr) \
            * float(epoch) / float(self.warmup_steps)
        return self.warmup_begin_lr + increase

    def __call__(self, epoch):
        if epoch < self.warmup_steps:
            return self.get_warmup_lr(epoch)
        if epoch <= self.max_update:
            # Cosine decay from base_lr_orig down to final_lr
            self.base_lr = self.final_lr + (
                self.base_lr_orig - self.final_lr) * (1 + math.cos(
                math.pi * (epoch - self.warmup_steps) / self.max_steps)) / 2
        return self.base_lr


# num_epochs is not defined in this snippet; 30 is an assumed value
num_epochs = 30
scheduler = CosineScheduler(max_update=20, base_lr=0.3, final_lr=0.01)
d2l.plot(torch.arange(num_epochs), [scheduler(t) for t in range(num_epochs)])
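
To check the three phases of the schedule without plotting, the same formula can be written as a stateless function and evaluated at a few steps. This is only a minimal sketch: the function name `cosine_lr` is hypothetical, the 5 warmup steps are an illustrative choice, and it assumes the rate should stay at `final_lr` once `max_update` has passed.

```python
import math


def cosine_lr(step, max_update, base_lr=0.01, final_lr=0.0,
              warmup_steps=0, warmup_begin_lr=0.0):
    """Stateless sketch: linear warmup, cosine decay, then a constant rate."""
    if step < warmup_steps:
        # Linear ramp from warmup_begin_lr toward base_lr
        frac = step / warmup_steps
        return warmup_begin_lr + (base_lr - warmup_begin_lr) * frac
    if step <= max_update:
        # Cosine decay over the max_update - warmup_steps remaining steps
        progress = (step - warmup_steps) / (max_update - warmup_steps)
        return final_lr + (base_lr - final_lr) * (1 + math.cos(math.pi * progress)) / 2
    # Assumption: past max_update the rate stays at final_lr
    return final_lr


# Illustrative settings: 5 warmup steps inside a 20-step schedule
schedule = [cosine_lr(t, max_update=20, base_lr=0.3, final_lr=0.01, warmup_steps=5)
            for t in range(30)]

# Print the rate at the start, at the end of warmup, at max_update, and afterwards
for t in (0, 5, 20, 29):
    print(t, round(schedule[t], 4))
```

With these settings the rate starts at the warmup value, reaches the base rate at the end of warmup, decays to the final rate at step 20, and stays there. Because the function keeps no state, it gives the same result for a given step regardless of the order in which steps are queried.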


Updated 2026-05-18


References

Dive into Deep Learning
