Concept

Channel Transformation in Fully Convolutional Networks

In a Fully Convolutional Network (FCN), once the backbone network has extracted image features, a $1 \times 1$ convolutional layer is applied. This layer maps the number of channels output by the feature extractor to the number of target classes (for example, $21$ classes for the Pascal VOC2012 dataset) while leaving the spatial dimensions of the feature maps unchanged.

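# PyTorch
num_classes = 21
# 512 is the number of channels output by the backbone in this example
net.add_module('final_conv', nn.Conv2d(512, num_classes, kernel_size=1))

# MXNet
num_classes = 21
# The input channel count is inferred by Gluon on the first forward pass
net.add(nn.Conv2D(num_classes, kernel_size=1))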
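The effect of this layer on tensor shapes can be checked with a small standalone sketch. It is not the original network: the random `features` tensor stands in for the backbone output, and its batch size and 10×15 spatial grid are illustrative assumptions. The last few lines also check that a 1×1 convolution amounts to the same linear map over channels applied independently at every pixel.

```python
import torch
from torch import nn

num_classes = 21  # Pascal VOC2012: 20 object classes plus background

# Hypothetical stand-in for the backbone output:
# batch of 1, 512 channels, 10x15 spatial grid (sizes chosen for illustration).
features = torch.randn(1, 512, 10, 15)

# The 1x1 convolution changes only the channel dimension.
final_conv = nn.Conv2d(512, num_classes, kernel_size=1)
scores = final_conv(features)

print(features.shape)  # torch.Size([1, 512, 10, 15])
print(scores.shape)    # torch.Size([1, 21, 10, 15])

# Equivalent view: one (num_classes x 512) matrix applied at each pixel.
weight = final_conv.weight.view(num_classes, 512)
bias = final_conv.bias.view(1, num_classes, 1, 1)
manual = torch.einsum('oc,bchw->bohw', weight, features) + bias
same = torch.allclose(scores, manual, atol=1e-4)
```

Height and width pass through unchanged because a kernel of size 1 with the default stride of 1 and padding of 0 never looks at neighbouring pixels; only the channel axis is remapped from 512 to `num_classes`.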

Updated 2026-05-21

Tags

D2L

Dive into Deep Learning @ D2L