Example
RMSProp Training on Airfoil Dataset
When a linear regression model is trained from scratch on the Airfoil Self-Noise dataset using the RMSProp optimizer with an initial learning rate of 0.01, a high decay parameter γ, and minibatch updates, the training loss converges to approximately 0.245. This shows that RMSProp can train the model effectively when the learning rate and decay factor are configured appropriately. A typical configuration pairs a modest learning rate with a high decay factor. AdaGrad, by contrast, often needs a larger initial learning rate, because its accumulated squared gradients only grow and keep shrinking the effective step size.
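The training setup described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the original experiment: it uses synthetic data in place of the Airfoil Self-Noise dataset, and the values `gamma=0.9`, `batch_size=10`, `epochs=5`, and `eps=1e-6`, along with the helper names `rmsprop_step` and `train_linear_regression`, are assumptions chosen for the example. Only the learning rate of 0.01 comes from the text, so the final loss will not match the 0.245 reported there.

```python
import numpy as np

def rmsprop_step(params, grads, states, lr, gamma, eps=1e-6):
    """Apply one RMSProp update in place to each parameter array."""
    for p, g, s in zip(params, grads, states):
        # Leaky average of squared gradients (the RMSProp state).
        s[:] = gamma * s + (1 - gamma) * g ** 2
        # Scale each coordinate's step by the root of that average.
        p[:] -= lr * g / np.sqrt(s + eps)

def train_linear_regression(X, y, lr=0.01, gamma=0.9, batch_size=10,
                            epochs=5, seed=0):
    """Fit y ~ X @ w + b with minibatch RMSProp; return (w, b, final loss)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(0.0, 0.01, size=d)
    b = np.zeros(1)
    states = [np.zeros_like(w), np.zeros_like(b)]
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w + b - y[idx]
            # Gradients of the halved mean squared error on this minibatch.
            grad_w = X[idx].T @ err / len(idx)
            grad_b = np.array([err.mean()])
            rmsprop_step([w, b], [grad_w, grad_b], states, lr, gamma)
    loss = 0.5 * np.mean((X @ w + b - y) ** 2)
    return w, b, loss

# Synthetic stand-in data (assumption): 1500 examples, 5 features.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(1500, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + 0.3 + data_rng.normal(scale=0.1, size=1500)

w, b, loss = train_linear_regression(X, y)
print("final training loss:", loss)
```

Because each step is normalized by the running root-mean-square of the gradient, every coordinate moves by roughly `lr` per update regardless of gradient scale, which is why a modest learning rate such as 0.01 is usually sufficient here.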
Updated 2026-05-15
Tags
D2L
Dive into Deep Learning @ D2L