Example
RMSProp Training on Airfoil Dataset
When a linear regression model is trained from scratch on the Airfoil Self-Noise dataset with the RMSProp optimizer, given a suitable initial learning rate, decay parameter, and batch size, the training loss converges. This demonstrates that RMSProp can train the model effectively when the learning rate and decay factor are configured appropriately. A typical configuration pairs a modest learning rate with a high decay factor. AdaGrad, by contrast, often needs a larger initial learning rate, because its accumulated squared gradients only grow and therefore shrink the effective step size aggressively.
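The experiment above can be sketched in a few lines of NumPy. This is a minimal illustration, not the original training script: the data is synthetic (5 features, loosely mirroring the 5 input attributes of the Airfoil Self-Noise dataset), and the default values `lr=0.01`, `gamma=0.9`, and `batch_size=10` are illustrative assumptions in the spirit of "modest learning rate, high decay factor" rather than values taken from this card. The helper names `make_synthetic_data`, `rmsprop_step`, and `train_linear_regression` are hypothetical.

```python
import numpy as np


def make_synthetic_data(n=1500, d=5, noise=0.1, seed=0):
    """Synthetic stand-in for a standardized regression dataset (assumption)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    true_w = np.array([[2.0], [-3.4], [1.0], [0.5], [-1.0]])[:d]
    true_b = 4.2
    y = X @ true_w + true_b + noise * rng.normal(size=(n, 1))
    return X, y


def rmsprop_step(params, grads, states, lr, gamma, eps=1e-6):
    """One in-place RMSProp update.

    s <- gamma * s + (1 - gamma) * g^2
    p <- p - lr * g / sqrt(s + eps)
    """
    for p, g, s in zip(params, grads, states):
        s[:] = gamma * s + (1 - gamma) * g ** 2
        p[:] -= lr * g / np.sqrt(s + eps)


def train_linear_regression(X, y, lr=0.01, gamma=0.9, batch_size=10,
                            num_epochs=10, seed=0):
    """Minibatch training of y ~ Xw + b with squared loss and RMSProp."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(0.0, 0.01, size=(d, 1))
    b = np.zeros(1)
    # One leaky-average state per parameter, same shape as the parameter.
    states = [np.zeros_like(w), np.zeros_like(b)]
    for _ in range(num_epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w + b - y[idx]        # residuals, shape (m, 1)
            grad_w = X[idx].T @ err / len(idx)   # gradient of 0.5 * mean(err^2)
            grad_b = err.mean(axis=0)
            rmsprop_step([w, b], [grad_w, grad_b], states, lr, gamma)
    final_loss = 0.5 * float(np.mean((X @ w + b - y) ** 2))
    return w, b, final_loss


if __name__ == "__main__":
    X, y = make_synthetic_data()
    _, _, loss = train_linear_regression(X, y)
    print(f"final training loss: {loss:.4f}")
```

Because the state `s` is a leaky average rather than a running sum, the effective step size stays on the order of `lr` throughout training. This is why a modest learning rate is enough here, whereas AdaGrad's ever-growing accumulator tends to call for a larger starting value.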
Updated 2026-05-15
Tags
D2L
Dive into Deep Learning @ D2L