Concept

Applying Batch Normalization to LeNet

Batch normalization can be added to the classic LeNet-5 architecture by inserting a batch normalization layer after each convolutional or fully connected hidden layer, before the corresponding activation function. The resulting network, sometimes called BNLeNet, keeps the layer progression of the original LeNet: two convolutional blocks (each followed by sigmoid activation and average pooling) and three fully connected layers with 120, 84, and 10 output units. Batch normalization follows both convolutions and the first two fully connected layers; the final 10-unit output layer has no activation and is left unnormalized. For the convolutional layers, batch normalization operates on four-dimensional inputs and computes one mean and variance per channel, pooled over the batch and all spatial locations. For the fully connected layers, it operates on two-dimensional inputs and computes one mean and variance per feature, over the batch. The architecture can be implemented either from scratch with a custom batch normalization class or concisely with built-in high-level API layers (such as nn.LazyBatchNorm2d and nn.LazyBatchNorm1d in PyTorch). The concise version is nearly identical in code, but the built-in layers infer the number of features on the first forward pass, so the feature count and input dimensionality no longer need to be passed by hand. Training the modified network on the Fashion-MNIST dataset with a batch size of 128 and a learning rate of 0.1 for 10 epochs uses the same training pipeline as the original LeNet, which shows that batch normalization can be dropped into an existing architecture with only a few added layers.
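The layer ordering described above can be sketched in PyTorch roughly as follows. This is a minimal illustration using the built-in lazy layers, not a reference implementation: the helper name make_bn_lenet is hypothetical, the convolutions are assumed to use 5x5 kernels with no padding, and a random tensor stands in for a batch of 128 Fashion-MNIST images (1 channel, 28x28 pixels).

```python
import torch
from torch import nn


def make_bn_lenet(num_classes: int = 10) -> nn.Sequential:
    """LeNet-style network with batch normalization before each sigmoid.

    Hypothetical helper for illustration; kernel sizes are assumptions.
    """
    return nn.Sequential(
        # Convolutional block 1: conv -> batch norm (4D input) -> sigmoid -> pool
        nn.LazyConv2d(6, kernel_size=5), nn.LazyBatchNorm2d(),
        nn.Sigmoid(), nn.AvgPool2d(kernel_size=2, stride=2),
        # Convolutional block 2
        nn.LazyConv2d(16, kernel_size=5), nn.LazyBatchNorm2d(),
        nn.Sigmoid(), nn.AvgPool2d(kernel_size=2, stride=2),
        nn.Flatten(),
        # Fully connected layers: linear -> batch norm (2D input) -> sigmoid
        nn.LazyLinear(120), nn.LazyBatchNorm1d(), nn.Sigmoid(),
        nn.LazyLinear(84), nn.LazyBatchNorm1d(), nn.Sigmoid(),
        # Output layer: no batch normalization, no activation
        nn.LazyLinear(num_classes),
    )


torch.manual_seed(0)
net = make_bn_lenet()

# Random stand-in for one Fashion-MNIST batch (assumption: 1x28x28 inputs).
X = torch.randn(128, 1, 28, 28)
y = torch.randint(0, 10, (128,))

# The first forward pass lets the lazy layers infer their input sizes.
logits = net(X)
print(logits.shape)  # torch.Size([128, 10])

# In training mode, the first batch norm layer centers each of the 6 channels
# using statistics pooled over the batch and both spatial dimensions.
h = net[1](net[0](X))
per_channel_mean = h.mean(dim=(0, 2, 3))

# One SGD step with the learning rate quoted in the text.
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
loss = nn.functional.cross_entropy(net(X), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because the lazy layers infer channel and feature counts from the first batch, no size or dimensionality arguments appear in the batch normalization calls; a from-scratch layer would need both passed explicitly.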

Updated 2026-05-13

Tags

D2L

Dive into Deep Learning @ D2L