Concept

Dilemma of Initial Learning Rate

When training advanced neural network designs, careful parameter initialization alone is sometimes not enough to guarantee stable optimization. This creates a dilemma in choosing the initial learning rate: a sufficiently small rate prevents divergence early in training but makes progress extremely slow, whereas a large rate tends to cause divergence within the first few updates.
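The trade-off can be seen even in a toy setting. The sketch below is an illustration under stated assumptions, not the setup of any particular network: it runs plain gradient descent on the one-dimensional quadratic f(x) = 0.5 * c * x^2, where the function name `gradient_descent`, the curvature c = 10, and the two learning rates are hypothetical choices made for this example. For this objective each update multiplies x by (1 - lr * c), so the iterate shrinks slowly when lr is tiny and grows without bound once lr exceeds 2 / c.

```python
def gradient_descent(lr, curvature=10.0, x0=1.0, steps=20):
    """Run plain gradient descent on f(x) = 0.5 * curvature * x**2.

    All parameter values are illustrative assumptions for this toy problem.
    """
    x = x0
    for _ in range(steps):
        grad = curvature * x  # derivative of 0.5 * curvature * x**2
        x = x - lr * grad     # standard gradient descent update
    return x


# A very small learning rate is stable but barely moves toward the minimum at 0.
small = gradient_descent(lr=0.001)

# A learning rate above 2 / curvature makes |x| grow at every step.
large = gradient_descent(lr=0.25)

print(f"lr=0.001 -> x after 20 steps: {small:.4f}")
print(f"lr=0.25  -> x after 20 steps: {large:.1f}")
```

With these assumed values, the small rate leaves x close to its starting point after 20 steps, while the large rate drives the magnitude of x far beyond where it started. Real networks are not quadratic, so this only shows the mechanism behind the two failure modes.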


Updated 2026-05-18


Tags

D2L

Dive into Deep Learning @ D2L
