1Cademy - Grouped Convolution

Learn Before

Multi-Output Channel Convolution Kernel Structure

Concept

Grouped Convolution

In a convolutional layer, splitting a standard convolution from $c_\textrm{i}$ input channels to $c_\textrm{o}$ output channels into $g$ independent groups is known as grouped convolution. The input channels are divided into $g$ groups of $c_\textrm{i}/g$ channels each, and each group independently produces $c_\textrm{o}/g$ output channels, so the $g$ group outputs together make up all $c_\textrm{o}$ output channels. This reduces the computational cost from $\mathcal{O}(c_\textrm{i} \cdot c_\textrm{o})$ to $\mathcal{O}(g \cdot (c_\textrm{i}/g) \cdot (c_\textrm{o}/g)) = \mathcal{O}(c_\textrm{i} \cdot c_\textrm{o} / g)$, a $g$-fold reduction in operations. The number of parameters shrinks by the same factor: a single $c_\textrm{i} \times c_\textrm{o}$ matrix is replaced by $g$ smaller matrices of size $(c_\textrm{i}/g) \times (c_\textrm{o}/g)$, for a total of $c_\textrm{i} \cdot c_\textrm{o} / g$ entries. Both $c_\textrm{i}$ and $c_\textrm{o}$ are assumed to be divisible by $g$.
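
The counting argument above can be checked with a small sketch. It uses plain Python and considers only the channel-mixing step at a single spatial position, ignoring the kernel's height and width. The function names `param_count` and `grouped_channel_mix` and the toy sizes are illustrative assumptions, not part of any library.

```python
def param_count(c_i, c_o, g=1):
    """Number of channel-mixing weights for g groups (g=1 is a standard convolution)."""
    if c_i % g or c_o % g:
        raise ValueError("c_i and c_o must both be divisible by g")
    # g matrices, each of size (c_i / g) x (c_o / g)
    return g * (c_i // g) * (c_o // g)


def grouped_channel_mix(x, weights):
    """Apply one small weight matrix per group to its own slice of input channels.

    x:       list of c_i channel values at one spatial position
    weights: list of g matrices, each with c_o/g rows and c_i/g columns
    """
    g = len(weights)
    if len(x) % g:
        raise ValueError("number of input channels must be divisible by g")
    size_in = len(x) // g
    out = []
    for k, w in enumerate(weights):
        x_group = x[k * size_in:(k + 1) * size_in]  # channels belonging to group k
        for row in w:
            out.append(sum(a * b for a, b in zip(row, x_group)))
    return out


# Toy sizes chosen for illustration only.
c_i, c_o, g = 8, 12, 4
dense_params = param_count(c_i, c_o)        # 8 * 12 = 96
grouped_params = param_count(c_i, c_o, g)   # 4 * (2 * 3) = 24

# All-ones weights make the result easy to verify by hand.
ones = [[[1] * (c_i // g) for _ in range(c_o // g)] for _ in range(g)]
y = grouped_channel_mix(list(range(c_i)), ones)

print(dense_params, grouped_params, dense_params // grouped_params)  # 96 24 4
print(y)  # [1, 1, 1, 5, 5, 5, 9, 9, 9, 13, 13, 13]
```

Each output channel depends only on the input channels of its own group, which is why both the weight count and the multiply-add count fall by a factor of $g$ in this sketch.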