Locality and Hardware Optimization of Convolutions
The fundamental computation of a convolutional layer, typically implemented as a cross-correlation, is highly local: each output element depends only on a small, contiguous region of the input. This locality permits significant hardware optimization, because chip designers building accelerators for convolutions can invest in fast computation units rather than in large memory capacity or bandwidth. Such specialized hardware may not be optimal for other kinds of algorithms, but it provides the computational efficiency needed for affordable, ubiquitous computer vision applications.
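The locality property can be checked directly with a minimal sketch. The code below is an illustration in plain Python, not an implementation taken from the source: the helper name `corr2d`, the 6x6 input, and the 3x3 kernel values are all assumptions chosen for the example. It computes a 2D cross-correlation with no padding and stride 1, then perturbs one input element to see which outputs react.

```python
def corr2d(X, K):
    """2D cross-correlation, no padding, stride 1 (illustrative helper)."""
    h, w = len(K), len(K[0])
    out_h, out_w = len(X) - h + 1, len(X[0]) - w + 1
    Y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Output (i, j) reads only the h-by-w input window starting at (i, j).
            Y[i][j] = sum(X[i + a][j + b] * K[a][b]
                          for a in range(h) for b in range(w))
    return Y

# Hypothetical example data: a 6x6 input and a 3x3 kernel.
X = [[float(6 * r + c) for c in range(6)] for r in range(6)]
K = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
Y = corr2d(X, K)

# Perturb an input element far outside the window of output (0, 0).
X_far = [row[:] for row in X]
X_far[5][5] += 100.0
Y_far = corr2d(X_far, K)

# Perturb an input element inside the window of output (0, 0).
X_near = [row[:] for row in X]
X_near[1][1] += 100.0
Y_near = corr2d(X_near, K)

# Count how many outputs were affected by the far-corner perturbation.
changed_by_far = sum(Y_far[i][j] != Y[i][j]
                     for i in range(len(Y)) for j in range(len(Y[0])))

print("output (0, 0) unchanged by far perturbation:", Y_far[0][0] == Y[0][0])
print("output (0, 0) changed by near perturbation:", Y_near[0][0] != Y[0][0])
print("outputs affected by far perturbation:", changed_by_far)
```

Because each output reads only a 3x3 window, changing the corner element `X[5][5]` affects a single output, while output `(0, 0)` responds only to changes inside its own window. This small, fixed working set per output is what lets an accelerator favor arithmetic throughput over memory capacity.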
Tags
D2L
Dive into Deep Learning @ D2L
Related
Convolution Layer Output Size and Parameter Formulas
Limitation of Manually Designed Convolutional Kernels
Equivalence of Strict Convolution and Cross-Correlation
Feature Map
Two-Dimensional Convolutional Layer Code Implementation
Convolution Kernel and Layer Size Notation
Waldo Detector Convolution Example
Object Edge Detection Using Convolution
Locality and Hardware Optimization of Convolutions
Channel Depth in Convolutional Networks