Inductive Bias of Residual Connections

In deep learning, adding capacity is only guaranteed to help when the larger function class contains the smaller one. Nested function classes of this kind become strictly more expressive as they grow, rather than merely different. Residual connections achieve this nesting by adding each block's input directly to its output, so a block computes $f(\mathbf{x}) = g(\mathbf{x}) + \mathbf{x}$, where $g$ is the residual mapping learned by the block's weight layers. This architectural choice shifts the network's inductive bias: instead of treating the simplest function as $f(\mathbf{x}) = 0$, the network treats it as $f(\mathbf{x}) = \mathbf{x}$. As a result, a newly added block can recover the identity function simply by pushing the parameters of its weight layers toward zero, so that $g(\mathbf{x}) = 0$.
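The effect of zero weights can be checked with a minimal sketch in plain Python. The names `dense_relu`, `plain_block`, and `residual_block` are hypothetical helpers introduced only for this illustration. The sketch also assumes a single fully connected layer with ReLU as the residual mapping and omits the activation that residual blocks commonly apply after the addition.

```python
def dense_relu(x, weights, bias):
    """One fully connected layer followed by ReLU, on plain Python lists."""
    out = []
    for row, b in zip(weights, bias):
        pre_activation = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(max(0.0, pre_activation))
    return out


def plain_block(x, weights, bias):
    # Without a skip connection the block output is just g(x).
    return dense_relu(x, weights, bias)


def residual_block(x, weights, bias):
    # With a skip connection the block output is g(x) + x.
    g = dense_relu(x, weights, bias)
    return [gi + xi for gi, xi in zip(g, x)]


x = [1.5, -2.0, 0.25]
zero_w = [[0.0] * 3 for _ in range(3)]  # all weights pushed to zero
zero_b = [0.0] * 3                      # all biases pushed to zero

plain_out = plain_block(x, zero_w, zero_b)
residual_out = residual_block(x, zero_w, zero_b)

print(plain_out)     # [0.0, 0.0, 0.0]
print(residual_out)  # [1.5, -2.0, 0.25]
```

With all parameters at zero, the plain block collapses to the zero function while the residual block reproduces its input, which is the shift in the "simplest function" described above.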

Updated 2026-05-13

Tags

D2L

Dive into Deep Learning @ D2L