Concept

Inception Block Structure

The fundamental convolutional block in the GoogLeNet architecture is the Inception block. It consists of four parallel branches that process the same input to extract information at different spatial scales. The first branch uses a $1 \times 1$ convolutional layer. The second and third branches each start with a $1 \times 1$ convolution, which reduces the number of channels and therefore the model complexity, followed by a $3 \times 3$ and a $5 \times 5$ convolution, respectively. The fourth branch applies a $3 \times 3$ max-pooling layer followed by a $1 \times 1$ convolutional layer that adjusts the channel count. All branches use appropriate padding so that the spatial dimensions (height and width) of the output match those of the input. Finally, the outputs of the four branches are concatenated along the channel dimension to form the block's output.
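The shape bookkeeping described above can be sketched in plain Python. This is only an illustration of how padding keeps height and width fixed and how the channel counts add up under concatenation; it does not perform any convolution. The function names (`conv_out_size`, `inception_output_shape`) and the parameter names `c1` to `c4` are hypothetical, stride 1 is assumed throughout, and the channel counts in the usage lines are illustrative values chosen to resemble GoogLeNet's first Inception block.

```python
def conv_out_size(size, kernel, padding, stride=1):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def inception_output_shape(in_shape, c1, c2, c3, c4):
    """Return the block's output shape and the per-branch shapes.

    in_shape: (channels, height, width) of the input.
    c1: output channels of branch 1 (1x1 conv).
    c2: (reduce, out) channels of branch 2 (1x1 conv, then 3x3 conv).
    c3: (reduce, out) channels of branch 3 (1x1 conv, then 5x5 conv).
    c4: output channels of branch 4 (3x3 max-pool, then 1x1 conv).
    """
    _, h, w = in_shape
    # Each branch: (final output channels, [(kernel, padding), ...]).
    # Padding (k - 1) // 2 with stride 1 preserves height and width.
    branches = [
        (c1, [(1, 0)]),
        (c2[1], [(1, 0), (3, 1)]),
        (c3[1], [(1, 0), (5, 2)]),
        (c4, [(3, 1), (1, 0)]),
    ]
    shapes = []
    for out_channels, layers in branches:
        bh, bw = h, w
        for kernel, padding in layers:
            bh = conv_out_size(bh, kernel, padding)
            bw = conv_out_size(bw, kernel, padding)
        shapes.append((out_channels, bh, bw))
    # Concatenation along channels requires identical spatial dimensions.
    assert all(shape[1:] == (h, w) for shape in shapes)
    total_channels = sum(shape[0] for shape in shapes)
    return (total_channels, h, w), shapes


# Illustrative values: 192 input channels on a 28 x 28 feature map.
out_shape, branch_shapes = inception_output_shape(
    (192, 28, 28), c1=64, c2=(96, 128), c3=(16, 32), c4=32
)
print(out_shape)  # (256, 28, 28)
```

Note that the reduce widths (96 and 16 here) do not appear in the output channel count; they only control how much work the $3 \times 3$ and $5 \times 5$ convolutions have to do.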


Updated 2026-05-13

Tags

Data Science

D2L

Dive into Deep Learning @ D2L
