Fast R-CNN Architecture Pipeline
The Fast R-CNN model processes an image in four major steps. First, a trainable CNN extracts features from the entire image, producing a feature map of shape $1 \times c \times h_1 \times w_1$. Second, selective search generates $n$ region proposals, which mark regions of interest of different shapes on the CNN output. A region of interest (RoI) pooling layer takes the CNN output and the proposals, extracts features of a uniform height $h_2$ and width $w_2$ from each proposal, and concatenates them into a tensor of shape $n \times c \times h_2 \times w_2$. Third, a fully connected layer transforms these features into a matrix of shape $n \times d$, where $d$ depends on the model design. Finally, this output is transformed into shape $n \times q$ for object class prediction using softmax regression (where $q$ is the number of classes) and into shape $n \times 4$ for bounding box prediction.
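The RoI pooling step is the least obvious part of the pipeline, so a small sketch may help. The pure-Python code below shows how proposals of different sizes can be max-pooled into outputs of one fixed size. Writing n for the number of proposals, c for the number of channels, and h2 x w2 for the pooled size, the result has shape n x c x h2 x w2. All function names (`roi_max_pool`, `roi_pool_layer`, `nested_shape`), the inclusive `(y0, x0, y1, x1)` RoI format, and the floor/ceil binning rule are assumptions made for illustration. Library implementations such as `torchvision.ops.roi_pool` also handle a spatial scale and coordinate rounding, which are omitted here.

```python
import math


def roi_max_pool(channel, roi, out_h, out_w):
    """Max-pool one RoI of a single 2-D channel into an out_h x out_w grid.

    channel: list of rows of numbers (h1 x w1).
    roi: (y0, x0, y1, x1), inclusive corners in feature-map coordinates.
    """
    y0, x0, y1, x1 = roi
    roi_h, roi_w = y1 - y0 + 1, x1 - x0 + 1
    pooled = []
    for i in range(out_h):
        # Assumed binning rule: bin i spans rows floor(i*roi_h/out_h) up to
        # (not including) ceil((i+1)*roi_h/out_h), so bins are never empty.
        r0 = y0 + (i * roi_h) // out_h
        r1 = y0 + math.ceil((i + 1) * roi_h / out_h)
        row = []
        for j in range(out_w):
            c0 = x0 + (j * roi_w) // out_w
            c1 = x0 + math.ceil((j + 1) * roi_w / out_w)
            row.append(max(channel[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


def roi_pool_layer(features, rois, out_h, out_w):
    """Pool every RoI on every channel: (c, h1, w1) -> (n, c, out_h, out_w)."""
    return [[roi_max_pool(channel, roi, out_h, out_w) for channel in features]
            for roi in rois]


def nested_shape(x):
    """Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(x, list):
        shape.append(len(x))
        x = x[0]
    return tuple(shape)


# Toy CNN output with c = 1 channel and h1 = w1 = 4 (batch dimension dropped).
features = [[[4 * r + c for c in range(4)] for r in range(4)]]
# Two proposals of different sizes: a 3 x 3 region and a 3 x 4 region.
rois = [(0, 0, 2, 2), (1, 0, 3, 3)]

pooled = roi_pool_layer(features, rois, out_h=2, out_w=2)
print(nested_shape(pooled))  # (2, 1, 2, 2), i.e. n x c x h2 x w2
print(pooled[0][0])          # [[5, 6], [9, 10]]
print(pooled[1][0])          # [[9, 11], [13, 15]]

# Flatten each proposal's features before the fully connected layer.
flat = [[v for channel in p for row in channel for v in row] for p in pooled]
print(nested_shape(flat))    # (2, 4), i.e. n x (c * h2 * w2)
```

Because every proposal ends up with the same number of features after flattening, a single fully connected layer can process all n proposals as one matrix. This is what allows the class and bounding box heads to produce one prediction per proposal.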
Updated 2026-05-21
Tags
D2L
Dive into Deep Learning @ D2L