1Cademy - Translation Invariance


Concept

Translation Invariance

The principle of translation invariance in computer vision asserts that a network should recognize objects regardless of their location in an image. When applied to constrain a multi-layer perceptron (MLP), this principle dictates that a shift in the input $\mathbf{X}$ must lead to an identical shift in the hidden representation $\mathbf{H}$ (strictly speaking, this property is translation equivariance). Consequently, the weight tensor $\mathsf{V}$ and bias $\mathbf{U}$ cannot depend on the absolute spatial coordinates $(i, j)$; the weights may depend only on the offsets $(a, b)$ relative to the output position. Using a constant bias $u$ and a shared set of weights $[\mathbf{V}]_{a, b}$, the hidden representation simplifies to:

$[\mathbf{H}]_{i, j} = u + \sum_a\sum_b [\mathbf{V}]_{a, b} [\mathbf{X}]_{i+a, j+b}$
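The formula above can be sketched in a few lines of plain Python to check the shift property numerically. This is a minimal illustration, not a reference implementation: the function names `hidden` and `shift`, the finite offset range `delta`, and the zero padding outside the image are all assumptions made for the example.

```python
def hidden(X, V, u, delta):
    """Compute [H]_{i,j} = u + sum_a sum_b [V]_{a,b} [X]_{i+a,j+b}.

    Assumptions for this sketch: offsets a, b run over -delta..delta,
    V is stored as a (2*delta+1) x (2*delta+1) list of lists, and
    pixels outside the image count as zero.
    """
    n, m = len(X), len(X[0])
    H = [[u] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for a in range(-delta, delta + 1):
                for b in range(-delta, delta + 1):
                    ii, jj = i + a, j + b
                    if 0 <= ii < n and 0 <= jj < m:
                        # The same V[a][b] is used at every position (i, j).
                        H[i][j] += V[a + delta][b + delta] * X[ii][jj]
    return H


def shift(X, s, t):
    """Move the image content by (s, t) pixels, filling vacated cells with zeros."""
    n, m = len(X), len(X[0])
    Y = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if 0 <= i + s < n and 0 <= j + t < m:
                Y[i + s][j + t] = X[i][j]
    return Y


# A small "object" near the top-left corner of a 6x6 image.
X = [[0] * 6 for _ in range(6)]
X[1][1], X[1][2], X[2][1] = 3, 5, 7

V = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]  # arbitrary shared weights
u, delta = 1, 1
s, t = 2, 3  # the shift keeps the object inside the image

H = hidden(X, V, u, delta)
H_shifted = hidden(shift(X, s, t), V, u, delta)

# Shifting the input shifts the hidden representation by the same amount.
equivariant = all(
    H_shifted[i + s][j + t] == H[i][j]
    for i in range(6 - s)
    for j in range(6 - t)
)
print(equivariant)
```

The check compares only positions that remain inside the image after the shift. Near the borders of a finite image, the zero padding assumed here is what keeps the comparison exact.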

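Sharing the weights $[\mathbf{V}]_{a, b}$ across all positions dramatically reduces the parameter count: for a 1-megapixel image, from about $10^{12}$ (a fully connected map between a 1-megapixel input and a 1-megapixel hidden representation) to about $4 \times 10^6$. The resulting operation is effectively a convolution; as written, with indices $i+a$ and $j+b$, it is more precisely a cross-correlation.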
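The parameter counts can be reproduced with a short calculation. The sketch below assumes a 1000 × 1000 image, a hidden representation of the same size, and offsets $a, b$ that each span roughly 2000 values so that any input pixel can reach any output pixel; the variable names are illustrative only.

```python
side = 1000                       # assumed image side length (1 megapixel)
pixels = side * side              # 10**6 input pixels, same number of hidden units

# Fully connected: one weight per (hidden position, input position) pair.
full_params = pixels * pixels     # 10**12

# Shared weights: one weight per offset (a, b), with about 2 * side
# possible offsets along each axis (an approximation for this sketch).
offsets_per_axis = 2 * side
shared_params = offsets_per_axis ** 2   # 4 * 10**6

reduction = full_params // shared_params
print(full_params, shared_params, reduction)
```

Under these assumptions, weight sharing alone cuts the parameter count by a factor of 250,000.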


Updated 2026-05-09


References

Dive into Deep Learning
