Concept

Batch Normalization in Convolutional Layers

In convolutional layers, batch normalization is typically applied after the convolution operation but before the nonlinear activation function. To preserve the translation invariance of convolutions, the normalization is executed on a per-channel basis simultaneously across all spatial locations. For a minibatch containing mm examples and an output feature map with height pp and width qq, the mean and variance are calculated over all mpqm \cdot p \cdot q elements for each individual channel. Consequently, each channel utilizes the same scalar scale and shift parameters to normalize values at every spatial location.

0

1

Updated 2026-05-13

Contributors are:

Who are from:

Tags

D2L

Dive into Deep Learning @ D2L