Concept

Generalization Paradox in Deep Learning

Guarantees from classical learning theory (such as bounds based on the Vapnik-Chervonenkis (VC) dimension or Rademacher complexity) can be conservative even for classical models, but they appear powerless to explain why deep neural networks generalize. For classification problems, deep models are typically expressive enough to fit arbitrary labels perfectly, even on datasets with millions of examples. In the classical picture, such extreme model capacity, even when combined with familiar remedies like $\ell_2$ regularization, should lead to severe overfitting. Paradoxically, despite fitting the training data with zero training error, these highly expressive models often generalize remarkably well to unseen data, contradicting traditional complexity-based generalization bounds.
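The claim that a sufficiently expressive model can perfectly fit arbitrary labels can be illustrated with a minimal sketch. The example below is an assumption-laden toy, not the experiment from any particular paper: it uses a fixed random-feature "hidden layer" with more features than training examples, so a least-squares readout can interpolate even completely random labels, driving training error to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200  # 50 examples, 200 random features: overparameterized

# toy inputs and completely random (arbitrary) binary labels
X = rng.standard_normal((n, 5))
labels = rng.integers(0, 2, size=n)

# fixed random hidden layer + trainable linear readout (random-feature model)
W = rng.standard_normal((5, d))
H = np.tanh(X @ W)  # hidden activations, shape (n, d)

# least-squares fit of the readout; with d > n and full-rank H,
# the minimum-norm solution interpolates the targets exactly
w, *_ = np.linalg.lstsq(H, 2.0 * labels - 1.0, rcond=None)

train_preds = (H @ w > 0).astype(int)
train_error = np.mean(train_preds != labels)
print(train_error)  # 0.0: perfect fit of random labels
```

Because the labels carry no signal, zero training error here says nothing about test performance; the paradox is that real deep networks in the same interpolating regime nevertheless generalize well on genuine labels.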


Updated 2026-05-06


Tags: D2L (Dive into Deep Learning)
