Applying Batch Normalization to LeNet
Batch normalization can be integrated into the classic LeNet-5 architecture by inserting a batch normalization layer after each convolutional or fully connected layer and before the corresponding activation function. The resulting network, sometimes called BNLeNet, keeps the layer progression of the original LeNet: two convolutional blocks (each followed by sigmoid activation and average pooling) and three fully connected layers with 120, 84, and 10 output units. The only change is a batch normalization operation placed between each layer's output and its activation. For the convolutional layers, batch normalization operates on four-dimensional inputs and computes one mean and variance per channel, pooled over the batch and all spatial locations; for the fully connected layers, it operates on two-dimensional inputs and computes one mean and variance per feature over the batch. This architecture can be implemented either from scratch using a custom batch normalization class or concisely using built-in high-level API layers (such as nn.LazyBatchNorm2d and nn.LazyBatchNorm1d in PyTorch). The concise version is nearly identical in structure, but the built-in layers infer the number of features from their first input, so dimensionality arguments do not have to be specified by hand. Training this modified network on the Fashion-MNIST dataset with a fixed batch size and learning rate for a set number of epochs shows that batch normalization fits into an existing architecture without significant changes to the training pipeline.
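The layer ordering described above can be sketched with PyTorch's built-in lazy layers. This is a minimal illustration, not the reference implementation: the helper name make_bn_lenet, the 5x5 kernels without padding, the 2x2 average pooling, the absence of batch normalization on the output layer, and the random 28x28 single-channel input (standing in for a Fashion-MNIST batch) are all assumptions made for the example.

```python
import torch
from torch import nn


def make_bn_lenet(num_classes: int = 10) -> nn.Sequential:
    """LeNet-style network with batch norm before every sigmoid (hypothetical helper)."""
    return nn.Sequential(
        # Convolutional block 1: conv -> batch norm -> sigmoid -> average pooling
        nn.LazyConv2d(6, kernel_size=5), nn.LazyBatchNorm2d(),
        nn.Sigmoid(), nn.AvgPool2d(kernel_size=2, stride=2),
        # Convolutional block 2: same ordering, 16 output channels
        nn.LazyConv2d(16, kernel_size=5), nn.LazyBatchNorm2d(),
        nn.Sigmoid(), nn.AvgPool2d(kernel_size=2, stride=2),
        nn.Flatten(),
        # Fully connected layers: linear -> batch norm -> sigmoid
        nn.LazyLinear(120), nn.LazyBatchNorm1d(), nn.Sigmoid(),
        nn.LazyLinear(84), nn.LazyBatchNorm1d(), nn.Sigmoid(),
        # Output layer: no activation follows, so no batch norm is inserted here
        nn.LazyLinear(num_classes),
    )


torch.manual_seed(0)
model = make_bn_lenet()
model.train()  # training mode: normalize with statistics of the current batch

# Random stand-in for a mini-batch of 8 grayscale 28x28 images.
x = torch.randn(8, 1, 28, 28)
out = model(x)  # the first forward pass lets the lazy layers infer their input sizes
print(out.shape)

# Four-dimensional case: after the first batch norm layer, each channel should
# have approximately zero mean over the batch and spatial dimensions.
feat = model[1](model[0](x))
channel_means = feat.mean(dim=(0, 2, 3))
print(channel_means)
```

Because the lazy layers take no channel or feature count, the same definition works unchanged if the input resolution or number of input channels differs; only the first forward pass fixes those sizes.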
Tags
D2L
Dive into Deep Learning @ D2L