Concept

Adadelta

Adadelta is an optimization algorithm that has no explicit learning rate parameter. Instead, it uses the rate of change in the parameters themselves to adapt the learning rate dynamically. To do so, the algorithm maintains two state variables: $\mathbf{s}_t$, a leaky average of the second moment of the gradient, and $\Delta\mathbf{x}_t$, a leaky average of the second moment of the changes to the model's parameters. The algorithm retains standard naming conventions for these variables to maintain consistency with similar optimization methods like momentum, AdaGrad, and RMSProp.
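As a rough illustration of how the two state variables interact, the following minimal NumPy sketch applies the commonly stated Adadelta update to a toy quadratic objective. The function name `adadelta_step`, the decay rate `rho=0.9`, the stabilizer `eps=1e-5`, and the objective itself are assumptions chosen for this example; they do not come from the text above.

```python
import numpy as np

def adadelta_step(x, grad, s, delta, rho=0.9, eps=1e-5):
    """One Adadelta update (illustrative sketch; rho and eps are assumed defaults)."""
    # s: leaky average of the squared gradient (second moment of the gradient).
    s = rho * s + (1 - rho) * grad ** 2
    # Rescale the gradient by the ratio of the two running averages.
    # This ratio plays the role of a per-coordinate learning rate.
    rescaled = np.sqrt(delta + eps) / np.sqrt(s + eps) * grad
    x = x - rescaled
    # delta: leaky average of the squared parameter change.
    delta = rho * delta + (1 - rho) * rescaled ** 2
    return x, s, delta

# Toy objective f(x) = sum(x**2), whose gradient is 2 * x.
start = np.array([5.0, -3.0])
x = start.copy()
s = np.zeros_like(x)      # state for the gradient's second moment
delta = np.zeros_like(x)  # state for the parameter change's second moment

for _ in range(100):
    x, s, delta = adadelta_step(x, 2 * x, s, delta)

print("start:", start, "after 100 steps:", x)
```

Because both states start at zero, the first steps are very small (on the order of `sqrt(eps)`), and the effective step size grows as `delta` accumulates. No learning rate appears anywhere in the update.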


Updated 2026-05-16

Tags

Data Science

D2L

Dive into Deep Learning @ D2L