1Cademy - Fast R-CNN Architecture Pipeline

Learn Before

Fast R-CNN

Concept

Fast R-CNN Architecture Pipeline

The Fast R-CNN model processes an image through four major computational steps. First, a trainable CNN extracts features from the entire image, outputting a feature map of shape $1 \times c \times h_1 \times w_1$. Second, selective search generates $n$ region proposals, which are mapped as regions of interest on the CNN output. A region of interest (RoI) pooling layer then extracts concatenated features of a uniform shape $n \times c \times h_2 \times w_2$ from these proposals. Third, a fully connected layer transforms these features into a matrix of shape $n \times d$. Finally, the output is transformed into a shape of $n \times q$ for object class prediction using softmax regression (where $q$ is the number of classes) and a shape of $n \times 4$ for bounding box prediction.
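The shape bookkeeping in these four steps can be made concrete with a small sketch. The code below is an illustration in plain Python, not the reference implementation: the helper names `roi_max_pool` and `trace_shapes` are hypothetical, and so are the box format (inclusive feature-map cell coordinates) and the binning rule (floor for a bin's start, ceiling for its end). It also assumes the pooled features are flattened to $n \times (c \cdot h_2 \cdot w_2)$ before the fully connected layer.

```python
from typing import Dict, List, Tuple

FeatureMap = List[List[List[float]]]  # c x h1 x w1 (the batch dimension of 1 is omitted)
Box = Tuple[int, int, int, int]       # (x1, y1, x2, y2) in feature-map cells, inclusive (assumed format)


def roi_max_pool(fmap: FeatureMap, box: Box, h2: int, w2: int) -> FeatureMap:
    """Max-pool one region of interest down to a fixed c x h2 x w2 grid."""
    x1, y1, x2, y2 = box
    roi_h, roi_w = y2 - y1 + 1, x2 - x1 + 1
    pooled = []
    for channel in fmap:
        rows = []
        for i in range(h2):
            # Bin i spans rows [r0, r1): floor for the start, ceiling for the end.
            r0 = y1 + (i * roi_h) // h2
            r1 = y1 + ((i + 1) * roi_h + h2 - 1) // h2
            row = []
            for j in range(w2):
                c0 = x1 + (j * roi_w) // w2
                c1 = x1 + ((j + 1) * roi_w + w2 - 1) // w2
                row.append(max(channel[r][col]
                               for r in range(r0, r1)
                               for col in range(c0, c1)))
            rows.append(row)
        pooled.append(rows)
    return pooled


def trace_shapes(c: int, h1: int, w1: int, n: int,
                 h2: int, w2: int, d: int, q: int) -> Dict[str, Tuple[int, ...]]:
    """Tensor shapes after each of the four steps described above."""
    return {
        "cnn_features": (1, c, h1, w1),    # step 1: whole-image feature map
        "roi_pooled": (n, c, h2, w2),      # step 2: one fixed-size block per proposal
        "flattened": (n, c * h2 * w2),     # assumed flattening before the FC layer
        "fc_output": (n, d),               # step 3: fully connected layer
        "class_scores": (n, q),            # step 4: softmax over q classes
        "bbox_offsets": (n, 4),            # step 4: bounding box prediction
    }


# Toy input: one channel, 4 x 4 feature map holding the values 0..15.
fmap = [[[float(4 * r + col) for col in range(4)] for r in range(4)]]
rois = [(0, 0, 2, 2), (0, 1, 3, 3)]  # n = 2 proposals of different sizes
pooled = [roi_max_pool(fmap, box, 2, 2) for box in rois]

print(pooled[0])  # [[[5.0, 6.0], [9.0, 10.0]]]
print(pooled[1])  # [[[9.0, 11.0], [13.0, 15.0]]]
print(trace_shapes(c=1, h1=4, w1=4, n=2, h2=2, w2=2, d=8, q=3))
```

Both proposals come out as $1 \times 2 \times 2$ blocks even though one region is $3 \times 3$ and the other $3 \times 4$, which is what lets the $n$ pooled results be stacked into a single $n \times c \times h_2 \times w_2$ tensor for the later layers.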

Updated 2026-05-21

References

Dive into Deep Learning

Learn After

Region of Interest Pooling Layer
