Concept
Inductive Bias of Residual Connections
In deep learning, adding capacity is only guaranteed to help when function classes are nested: a larger class that contains the smaller one is strictly more powerful, rather than just subtly different. Residual connections achieve this nesting by adding a block's input directly to its output, so that the block computes $f(\mathbf{x}) = \mathbf{x} + g(\mathbf{x})$, where $g$ is the residual mapping learned by the block's weight layers. Consequently, this architectural choice shifts the network's inductive bias: instead of assuming that simple functions take the form $f(\mathbf{x}) = 0$, the network assumes that simple functions look like $f(\mathbf{x}) = \mathbf{x}$. This makes the identity function much easier to learn: the block only needs to push the parameters of its weight layers toward zero, so that $g(\mathbf{x}) \approx 0$.
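The shift in inductive bias can be illustrated with a minimal pure-Python sketch. It is a simplification under stated assumptions, not the architecture of any particular library: the helper names (`matvec`, `relu`, `plain_block`, `residual_block`, `zeros`), the two-layer form of the residual mapping, and the example weights are all hypothetical, and the activation that real residual blocks usually apply after the addition is omitted. With the last weight layer set to zero, the plain block outputs zero while the residual block returns its input unchanged.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def relu(v):
    """Elementwise ReLU."""
    return [max(0.0, a) for a in v]


def plain_block(x, W1, b1, W2, b2):
    """Plain two-layer block: W2 * relu(W1 * x + b1) + b2, no shortcut."""
    h = relu([a + b for a, b in zip(matvec(W1, x), b1)])
    return [a + b for a, b in zip(matvec(W2, h), b2)]


def residual_block(x, W1, b1, W2, b2):
    """Residual block: f(x) = x + g(x), where g is the plain block."""
    g = plain_block(x, W1, b1, W2, b2)
    return [x_i + g_i for x_i, g_i in zip(x, g)]


def zeros(rows, cols):
    """Matrix of zeros with the given shape."""
    return [[0.0] * cols for _ in range(rows)]


# Hypothetical input and first-layer parameters, chosen arbitrarily.
x = [1.5, -2.0, 0.25]
W1 = [[0.3, -0.1, 0.8], [0.5, 0.2, -0.4], [-0.7, 0.9, 0.1]]
b1 = [0.1, -0.2, 0.3]

# Push the parameters of the last weight layer to zero.
W2_zero, b2_zero = zeros(3, 3), [0.0, 0.0, 0.0]

plain_out = plain_block(x, W1, b1, W2_zero, b2_zero)
residual_out = residual_block(x, W1, b1, W2_zero, b2_zero)

print(plain_out)     # [0.0, 0.0, 0.0]   -> plain block collapses to f(x) = 0
print(residual_out)  # [1.5, -2.0, 0.25] -> residual block gives f(x) = x
```

In this sketch the same "near-zero weights" setting corresponds to two different default functions, which is one way to read the claim that the shortcut changes what the network treats as simple.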
Updated 2026-05-13
Tags
D2L
Dive into Deep Learning @ D2L