Learn Before
Concept
Bounding Box Tensor Shape
In deep learning frameworks, multiple bounding boxes for a single image are programmatically represented as a two-dimensional tensor of shape (n, 4), where is the total number of bounding boxes. When processing a minibatch of images, this representation expands to a three-dimensional tensor of shape (b, n, 4), where represents the batch size. In both configurations, the final dimension of size 4 holds the numerical parameters defining each box, such as the two corner coordinates or the center position paired with width and height.
0
1
Updated 2026-06-14
Tags
D2L
Dive into Deep Learning @ D2L