Concept
Fixed-Shape Cropping in Semantic Segmentation
In semantic segmentation, each pixel of the input image corresponds to exactly one pixel of its ground-truth label. Because of this one-to-one alignment, rescaling images to fit a model's required input shape is problematic: the predicted pixel classes must then be mapped back to the original dimensions at inference time, and this inverse rescaling introduces errors, especially along the boundaries between semantic regions. To avoid these artifacts and preserve exact pixel correspondence, the image and its label are typically cropped to a fixed shape instead, using the same randomly positioned window for both.
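The cropping step can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation from any particular library: the function name `rand_crop_pair`, the array layout (height, width, channels for the image; height, width for the label), and the toy sizes are all assumptions chosen for the example. The key point it shows is that one random window is drawn once and then applied to both arrays.

```python
import numpy as np


def rand_crop_pair(image, label, height, width, rng=None):
    """Crop the same random (height, width) window from an image and its label.

    Hypothetical helper for illustration. Assumes `image` has shape (H, W, C)
    and `label` has shape (H, W) holding one class index per pixel.
    """
    rng = np.random.default_rng() if rng is None else rng
    img_h, img_w = label.shape
    if image.shape[:2] != (img_h, img_w):
        raise ValueError("image and label must share spatial dimensions")
    if height > img_h or width > img_w:
        raise ValueError("crop is larger than the image")

    # Draw the window position once so image and label stay aligned.
    top = int(rng.integers(0, img_h - height + 1))
    left = int(rng.integers(0, img_w - width + 1))

    return (
        image[top:top + height, left:left + width],
        label[top:top + height, left:left + width],
    )


# Toy data: the image channels simply copy the label, so alignment is easy to check.
rng = np.random.default_rng(0)
label = rng.integers(0, 21, size=(281, 500))
image = np.stack([label, label, label], axis=-1).astype(np.uint8)

img_crop, lbl_crop = rand_crop_pair(image, label, 200, 300, rng)
print(img_crop.shape, lbl_crop.shape)  # (200, 300, 3) (200, 300)
```

Because no interpolation is involved, every cropped label pixel is an original class index, which is what rescaling cannot guarantee. Images smaller than the crop shape are rejected here; in practice they are often filtered out of the dataset beforehand.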
Updated 2026-05-21
Tags
D2L
Dive into Deep Learning @ D2L