Concept

Tangent Propagation Algorithm

  • Similar to tangent distance algorithm
  • Closely related to dataset augmentation: both require the model to be invariant to certain specified directions of change in the input. Dataset augmentation is the non-infinitesimal version of tangent propagation
  • Trains a neural net classifier with extra penalty to make each output of the neural net locally invariant to known factors of variation
  • Factors correspond to movement along the manifold near which examples of the same class concentrate.
  • Local invariance is achieved by requiring the gradient $\nabla_x f(x)$ to be orthogonal to the known manifold tangent vectors $v^{(i)}$ at $x$
  • Equivalently, the directional derivative of $f$ at $x$ in each direction $v^{(i)}$ is made small by adding a regularization penalty $\Omega(f) = \sum_{i} \left( (\nabla_x f(x))^\top v^{(i)} \right)^2$, scaled by a hyperparameter; for a network with many outputs, the penalty is summed over the outputs
  • Tangent vectors are derived a priori, usually from knowledge of the effect of transformations
  • Has been used for supervised learning and reinforcement learning
  • User encodes prior knowledge of task by specifying a set of transformations that should not alter the output, and analytically regularizes the model to resist perturbation in the directions corresponding to the specified transformation
  • Only regularizes the model to resist infinitesimal perturbations, which poses difficulties for models based on rectified linear units, since such models can shrink their derivatives only by turning units off or shrinking their weights, not by saturating
  • Related to double backprop and adversarial training, both of which require that the model should be invariant to all directions of change in the input as long as the change is small
  • Double backprop regularizes the Jacobian to be small
  • Adversarial training finds inputs near the original inputs and trains the model to produce the same output on these as on the original inputs
  • Adversarial training is the non-infinitesimal version of double backprop
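The penalty $\Omega(f)$ above can be sketched numerically. A minimal illustration (toy scalar model and tangent direction chosen for the example, not from the source; the directional derivative $(\nabla_x f(x))^\top v$ is approximated by a central finite difference rather than computed analytically):

```python
import numpy as np

def tangent_prop_penalty(f, x, tangents, eps=1e-5):
    """Omega(f) = sum_i ((grad_x f(x))^T v_i)^2, with each directional
    derivative approximated by a central finite difference."""
    penalty = 0.0
    for v in tangents:
        # Directional derivative of f at x along v: (f(x+eps*v) - f(x-eps*v)) / (2*eps)
        d = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)
        penalty += d ** 2
    return penalty

# Hypothetical toy "network": a single tanh unit f(x) = tanh(w . x)
w = np.array([0.5, -1.0, 2.0])
f = lambda x: np.tanh(w @ x)

x = np.array([0.1, 0.2, 0.3])       # an input point
v = np.array([1.0, 0.0, 0.0])       # a known tangent direction at x (assumed)

omega = tangent_prop_penalty(f, x, [v])
```

In training, this penalty would be added to the task loss so that gradient descent pushes $\nabla_x f(x)$ toward orthogonality with each $v^{(i)}$; a real implementation would use automatic differentiation rather than finite differences.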

Updated 2021-06-24

Tags

Data Science