Concept

Grouped Convolution

Grouped convolution splits a standard convolution from $c_\textrm{i}$ input channels to $c_\textrm{o}$ output channels into $g$ independent groups. The input channels are divided into $g$ groups of $c_\textrm{i}/g$ channels each, and each group independently produces $c_\textrm{o}/g$ of the output channels; the $g$ outputs are then concatenated along the channel dimension. This reduces the computational cost from $\mathcal{O}(c_\textrm{i} \cdot c_\textrm{o})$ to $\mathcal{O}(c_\textrm{i} \cdot c_\textrm{o} / g)$, a $g$-fold reduction in arithmetic and, in principle, a $g$-fold speedup. Likewise, the parameters shrink from a single $c_\textrm{i} \times c_\textrm{o}$ matrix to $g$ smaller matrices of size $(c_\textrm{i}/g) \times (c_\textrm{o}/g)$, for a total of $c_\textrm{i} \cdot c_\textrm{o} / g$ entries, which is again $g$ times fewer. Both $c_\textrm{i}$ and $c_\textrm{o}$ are assumed to be divisible by $g$.
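The channel bookkeeping above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: it models only the channel-mixing step at a single spatial position (equivalent to a grouped 1×1 convolution), and the names `grouped_linear` and `param_count`, as well as the sizes and weights, are hypothetical choices made for this example.

```python
def grouped_linear(x, weights):
    """Grouped channel mixing at one pixel (a grouped 1x1 convolution).

    x: list of c_i input channel values.
    weights: list of g matrices, each with c_o/g rows and c_i/g columns.
    Returns a list of c_o output channel values.
    """
    g = len(weights)
    if len(x) % g != 0:
        raise ValueError("c_i must be divisible by g")
    in_per_group = len(x) // g
    out = []
    for k, w in enumerate(weights):
        # Group k only sees its own slice of the input channels.
        chunk = x[k * in_per_group:(k + 1) * in_per_group]
        for row in w:
            if len(row) != in_per_group:
                raise ValueError("each weight row must have c_i/g entries")
            out.append(sum(wi * xi for wi, xi in zip(row, chunk)))
    # Appending group by group concatenates the g outputs along channels.
    return out


def param_count(c_i, c_o, g=1):
    """Number of channel-mixing weights (kernel size and bias ignored)."""
    if c_i % g or c_o % g:
        raise ValueError("c_i and c_o must be divisible by g")
    return g * (c_i // g) * (c_o // g)


# Hypothetical sizes for illustration: c_i = 4, c_o = 6, g = 2.
x = [1.0, 2.0, 3.0, 4.0]
weights = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],   # group 0: inputs 0-1 -> outputs 0-2
    [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]],  # group 1: inputs 2-3 -> outputs 3-5
]
y = grouped_linear(x, weights)
print(y)                     # [1.0, 2.0, 3.0, 3.0, 4.0, -1.0]
print(param_count(4, 6, 1))  # 24 weights for the standard (ungrouped) case
print(param_count(4, 6, 2))  # 12 weights with g = 2
```

Each multiply-accumulate in `grouped_linear` uses exactly one weight, so at a single position the operation count equals `param_count` and shrinks by the same factor of $g$. In PyTorch, the same grouping is exposed through the `groups` argument of `torch.nn.Conv2d`.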

Updated 2026-05-13

Tags

D2L

Dive into Deep Learning @ D2L