1Cademy - Concise Adam Implementation

Learn Before

Adam Optimizer From-Scratch Implementation

Concise Adam Implementation

Like other optimization algorithms, Adam can be implemented concisely with the high-level APIs of modern deep learning frameworks. Instead of manually initializing the state variables for the first-moment (momentum) and second-moment estimates, applying bias correction, and writing the update equations from scratch, practitioners can instantiate a built-in optimizer class. These built-in implementations track the internal state and apply the numerical stability adjustments automatically. For example, PyTorch provides torch.optim.Adam, TensorFlow provides tf.keras.optimizers.Adam, and MXNet's Gluon API selects the algorithm by name with the string 'adam'. The only requirement is to pass the configuration hyperparameters, such as the learning rate ( $\eta$ ), to the optimizer's constructor; hyperparameters that are not specified fall back to the framework's defaults. In PyTorch, this can be written concisely as:

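import torch
from d2l import torch as d2l

trainer = torch.optim.Adam  # the optimizer class itself; the helper instantiates it
d2l.train_concise_ch11(trainer, {'lr': 0.01}, data_iter)  # data_iter: minibatch iterator defined earlier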
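The same idea can be sketched without the d2l helper, which may make it easier to see what the built-in optimizer takes over. The example below is a minimal illustration, not the book's training routine: the synthetic regression data, the model size, the batch size of 100, and the 200 epochs are all assumptions chosen for the sketch. Only torch.optim.Adam and the learning rate of 0.01 come from the text above.

```python
import torch
from torch import nn

torch.manual_seed(0)  # reproducible synthetic data and initialization

# Hypothetical data for illustration: y = X @ true_w + true_b + small noise
true_w = torch.tensor([2.0, -3.4])
true_b = 4.2
X = torch.randn(1000, 2)
y = X @ true_w + true_b + 0.01 * torch.randn(1000)

net = nn.Linear(2, 1)
loss_fn = nn.MSELoss()

# Instantiating the built-in class is the whole "implementation":
# moment estimates and bias correction are managed inside the optimizer.
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

initial_loss = loss_fn(net(X).squeeze(-1), y).item()

batch_size = 100  # assumed value for this sketch
for epoch in range(200):
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i + batch_size], y[i:i + batch_size]
        optimizer.zero_grad()                    # clear gradients from the previous step
        loss = loss_fn(net(xb).squeeze(-1), yb)  # minibatch mean squared error
        loss.backward()                          # compute gradients
        optimizer.step()                         # Adam update using internal state

final_loss = loss_fn(net(X).squeeze(-1), y).item()
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")

# The per-parameter state that a from-scratch version would maintain by hand
print(sorted(optimizer.state[net.weight].keys()))
```

The last line prints the keys of the state PyTorch keeps for the weight tensor, which include the running first- and second-moment estimates. Other hyperparameters, such as the moment decay rates and the stability constant, are left at PyTorch's defaults here.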

Updated 2026-05-16

References

Dive into Deep Learning
