Concept

Tangent Propagation Algorithm

  • Similar to tangent distance algorithm
  • Closely related to dataset augmentation: both require the model to be invariant to certain specified directions of change in the input. Dataset augmentation is the non-infinitesimal version of tangent propagation
  • Trains a neural net classifier with extra penalty to make each output of the neural net locally invariant to known factors of variation
  • Factors correspond to movement along the manifold near which examples of the same class concentrate.
  • Local invariance is achieved by requiring the gradient $\nabla_x f(x)$ to be orthogonal to the known manifold tangent vectors $v^{(i)}$ at $x$
  • Equivalently, the directional derivative of $f$ at $x$ in each direction $v^{(i)}$ is made small by adding a regularization penalty $\Omega(f) = \sum_{i} \left( (\nabla_x f(x))^\top v^{(i)} \right)^2$, scaled by a hyperparameter; for a network with many outputs, the penalty is summed over the outputs
  • Tangent vectors are derived a priori, usually from knowledge of the effect of transformations
  • Has been used for supervised learning and reinforcement learning
  • User encodes prior knowledge of task by specifying a set of transformations that should not alter the output, and analytically regularizes the model to resist perturbation in the directions corresponding to the specified transformation
  • Only regularizes the model to resist infinitesimal perturbations, which poses difficulties for models based on rectified linear units, since such models can shrink their derivatives only by turning units off or shrinking their weights, not by saturating
  • Related to double backprop and adversarial training, both of which require that the model should be invariant to all directions of change in the input as long as the change is small
  • Double backprop regularizes the Jacobian to be small
  • Adversarial training finds inputs near the original inputs and trains the model to produce the same output on these as on the original inputs
  • Adversarial training is the non-infinitesimal version of double backprop
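The penalty $\Omega(f)$ above can be sketched numerically. A minimal illustration (toy scalar model and tangent direction chosen for the example, not from the source; the directional derivative $(\nabla_x f(x))^\top v$ is approximated by a central finite difference rather than computed analytically):

```python
import numpy as np

def tangent_prop_penalty(f, x, tangents, eps=1e-5):
    """Omega(f) = sum_i ((grad_x f(x))^T v_i)^2, with each directional
    derivative approximated by a central finite difference."""
    penalty = 0.0
    for v in tangents:
        # Directional derivative of f at x along v: (f(x+eps*v) - f(x-eps*v)) / (2*eps)
        d = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)
        penalty += d ** 2
    return penalty

# Hypothetical toy "network": a single tanh unit f(x) = tanh(w . x)
w = np.array([0.5, -1.0, 2.0])
f = lambda x: np.tanh(w @ x)

x = np.array([0.1, 0.2, 0.3])       # an input point
v = np.array([1.0, 0.0, 0.0])       # a known tangent direction at x (assumed)

omega = tangent_prop_penalty(f, x, [v])
```

In training, this penalty would be added to the task loss so that gradient descent pushes $\nabla_x f(x)$ toward orthogonality with each $v^{(i)}$; a real implementation would use automatic differentiation rather than finite differences.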

Updated 2021-06-24

Tags

Data Science