Learn Before
Code
Square Root Learning Rate Scheduler Implementation
A square root learning rate scheduler can be implemented from scratch as a custom class that acts as a callable function. When invoked with the current number of updates, it calculates and returns the decayed learning rate using the formula . The following Python code demonstrates this implementation:
class SquareRootScheduler: def __init__(self, lr=0.1): self.lr = lr def __call__(self, num_update): return self.lr * pow(num_update + 1.0, -0.5)
0
1
Updated 2026-05-18
Tags
D2L
Dive into Deep Learning @ D2L