LeNet-5 Convolutional Block

The convolutional encoder in LeNet-5 consists of two repeated units, each containing three operations: a convolutional layer, a sigmoid activation function, and an average pooling layer. Each convolutional layer uses a $5 \times 5$ kernel. The first convolutional layer produces 6 output channels, while the second produces 16. After each convolution and activation, a $2 \times 2$ average pooling operation with stride 2 halves both the height and width of the representation, reducing the number of spatial positions by a factor of 4 per pooling step. The convolutional block maps spatially arranged inputs to a progressively increasing number of two-dimensional feature maps while decreasing spatial resolution. Its output has shape (batch size, number of channels, height, width).
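The shape changes described above can be traced with a small amount of arithmetic. The following is a minimal sketch in plain Python, not a definitive implementation: the helper names (`conv2d_shape`, `avg_pool2d_shape`, `lenet5_conv_block_shapes`) are hypothetical, and the $28 \times 28$ single-channel input and the padding of 2 on the first convolution are assumptions not stated in the text above. It only computes shapes; the sigmoid activation is elementwise and leaves the shape unchanged.

```python
def conv2d_shape(shape, out_channels, kernel, padding=0, stride=1):
    """Output shape of a 2D convolution on a (batch, channels, height, width) input."""
    batch, _, height, width = shape
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    return (batch, out_channels, out_h, out_w)


def avg_pool2d_shape(shape, kernel=2, stride=2):
    """Output shape of 2D average pooling; the channel count is unchanged."""
    batch, channels, height, width = shape
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return (batch, channels, out_h, out_w)


def lenet5_conv_block_shapes(input_shape, first_padding=2):
    """Trace shapes through the two conv -> sigmoid -> avg-pool units.

    first_padding=2 is an assumption (it keeps a 28x28 input at 28x28
    after the first convolution); pass 0 for an unpadded first layer.
    """
    shapes = [("input", input_shape)]

    # Unit 1: 5x5 convolution to 6 channels, sigmoid, 2x2 average pooling.
    x = conv2d_shape(input_shape, out_channels=6, kernel=5, padding=first_padding)
    shapes.append(("conv1 + sigmoid", x))
    x = avg_pool2d_shape(x)
    shapes.append(("avgpool1", x))

    # Unit 2: 5x5 convolution to 16 channels, sigmoid, 2x2 average pooling.
    x = conv2d_shape(x, out_channels=16, kernel=5)
    shapes.append(("conv2 + sigmoid", x))
    x = avg_pool2d_shape(x)
    shapes.append(("avgpool2", x))
    return shapes


if __name__ == "__main__":
    for name, shape in lenet5_conv_block_shapes((1, 1, 28, 28)):
        print(f"{name:16s} {shape}")
```

Under these assumptions the channel count grows from 1 to 6 to 16 while the spatial size shrinks from $28 \times 28$ to $14 \times 14$ to $5 \times 5$, so the block's output for a single image has shape (1, 16, 5, 5).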

Note that while ReLU and max-pooling generally perform better in modern networks, they were not yet in common use when LeNet was designed.

Updated 2026-05-12

Tags

D2L

Dive into Deep Learning @ D2L