Concept

Overlapping Crop Prediction in Fully Convolutional Networks

In semantic segmentation, test images often vary in size and shape. A fully convolutional network typically upsamples its feature maps with a transposed convolutional layer that has a fixed stride (e.g., 32), so when the height or width of an input image is not divisible by this stride, the output of the transposed convolution deviates from the input shape. To avoid this without distorting the image through resizing, prediction can be performed on multiple rectangular crops of the input, each with a height and width that are integer multiples of the stride, chosen so that their union covers the entire image. Forward propagation is run on each crop separately. For a pixel covered by several overlapping crops, the transposed convolution outputs for that pixel from all of those crops are averaged, and the average is passed to the softmax operation to predict the pixel's class.
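The procedure described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the reference implementation: `crop_starts`, `predict_overlapping`, and the `logits_fn` callable (a stand-in for the network's forward pass on one crop) are hypothetical names, the crop height and width are assumed to be multiples of the stride, and the image is assumed to be at least as large as one crop.

```python
import numpy as np

def crop_starts(size, crop, step):
    """Start offsets along one axis so that crops of length `crop` cover [0, size)."""
    if not 0 < step <= crop:
        raise ValueError("step must be in (0, crop], otherwise crops leave gaps")
    if size < crop:
        raise ValueError("image side is smaller than the crop side")
    starts = list(range(0, size - crop + 1, step))
    if starts[-1] != size - crop:
        starts.append(size - crop)  # shift the last crop back so it ends at the border
    return starts

def predict_overlapping(image, logits_fn, crop_h, crop_w, step_h, step_w, num_classes):
    """Average per-crop logits over overlaps, then apply softmax.

    image:     array of shape (C, H, W)
    logits_fn: hypothetical callable mapping a (C, crop_h, crop_w) crop to
               logits of shape (num_classes, crop_h, crop_w)
    crop_h, crop_w are assumed to be integer multiples of the network stride.
    """
    _, h, w = image.shape
    logit_sum = np.zeros((num_classes, h, w), dtype=np.float64)
    count = np.zeros((h, w), dtype=np.float64)
    for top in crop_starts(h, crop_h, step_h):
        for left in crop_starts(w, crop_w, step_w):
            crop = image[:, top:top + crop_h, left:left + crop_w]
            logit_sum[:, top:top + crop_h, left:left + crop_w] += logits_fn(crop)
            count[top:top + crop_h, left:left + crop_w] += 1
    mean_logits = logit_sum / count  # every pixel is covered at least once
    # Numerically stable softmax over the class axis
    shifted = mean_logits - mean_logits.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)
    return probs.argmax(axis=0), probs
```

With a stride of 32, a 100-pixel side and 64-pixel crops stepped by 32 give start offsets 0, 32, and 36: the last crop is shifted back to end at the image border, which is where the overlap comes from. In practice `logits_fn` would wrap the trained network's forward pass.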

Updated 2026-05-21

Tags

D2L

Dive into Deep Learning @ D2L