Residual Mapping
In a residual network, the desired underlying mapping $f(\mathbf{x})$, the function the network ultimately aims to approximate, is not learned directly by a stack of layers. Instead, those layers are reformulated to learn only the residual mapping $g(\mathbf{x}) = f(\mathbf{x}) - \mathbf{x}$, and the target function is recovered as $f(\mathbf{x}) = g(\mathbf{x}) + \mathbf{x}$. This reformulation is motivated by the degradation problem: as plain networks grow deeper, their training accuracy can paradoxically worsen, suggesting that the added layers struggle to approximate even the identity function. By recasting the problem in terms of $g(\mathbf{x})$, the identity case $f(\mathbf{x}) = \mathbf{x}$ reduces to $g(\mathbf{x}) = \mathbf{0}$, which is easier for a network to learn because it only requires driving the weights and biases of the constituent layers toward zero.
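The zero-weights argument can be checked with a small sketch. It is an illustration under stated assumptions, not the layout of any particular ResNet: the residual branch, written g here, is taken to be two fully connected layers with a ReLU between them, the block output is g(x) + x, and the activation that usually follows the addition is omitted. The helper names (`linear`, `residual_branch`, `residual_block`, `zero_params`) are hypothetical.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def linear(W, b, v):
    """Affine map W v + b, with W given as a list of rows."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]


def residual_branch(params, x):
    """g(x): two affine layers with a ReLU in between (assumed layout)."""
    W1, b1, W2, b2 = params
    return linear(W2, b2, relu(linear(W1, b1, x)))


def residual_block(params, x):
    """f(x) = g(x) + x: the shortcut adds the input back to the branch output."""
    return [g_i + x_i for g_i, x_i in zip(residual_branch(params, x), x)]


def zero_params(dim):
    """All weights and biases set to zero, so g(x) = 0 for every x."""
    zeros_matrix = [[0.0] * dim for _ in range(dim)]
    zeros_vector = [0.0] * dim
    return zeros_matrix, zeros_vector, zeros_matrix, zeros_vector


x = [1.5, -2.0, 0.25]
params = zero_params(len(x))

# The same layers without a shortcut collapse every input to zero.
plain_output = residual_branch(params, x)

# With the shortcut, zero weights make the block the identity mapping.
block_output = residual_block(params, x)
```

With all parameters at zero, the branch on its own discards the input, while the block with the shortcut passes `x` through unchanged. A plain stack would instead have to find specific non-zero weights to reproduce the identity.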
Tags
Data Science
D2L
Dive into Deep Learning @ D2L
Related
Recent Variants of ResNets
Advantages of ResNets
Plain vs. ResNets Convolutional Neural Network Architectures
Evaluate ResNet at different depths for ImageNet Classification
Evaluate ResNet models with other state-of-the-art models for ImageNet Classification
Shortcut’s technique for identity mapping
Deep Residual Learning for Image Recognition
Residual Mapping
ResNet Initial Layers
Highway Networks vs. Residual Networks
Influence of Residual Connections on Subsequent Architectures
Adding Layers During Training in Residual Networks
Accelerated Forward Propagation in Residual Networks