- LeNet-5 is most commonly used to recognize handwritten and machine-printed characters, e.g., the handwritten numbers on bank checks.

LeNet-5 Convolutional Neural Network Applications

- LeNet-5 is trained on 32 x 32 grayscale images.
- The first layer has six 5 x 5 filters with a stride of one. Its output is passed to 2 x 2 average pooling with a stride of two.
- The second layer has 16 filters of size 5 x 5, whose output is passed to 2 x 2 average pooling with a stride of two.
- Two fully connected layers take the pooled output and pass it to a softmax output layer that classifies the input into one of ten classes.
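
The layer sizes above determine the feature-map dimensions at every stage. The following is a minimal sketch of that arithmetic, assuming unpadded ("valid") convolutions and a stride of one in the second convolution, neither of which is stated in the bullets; the helper names `conv_out` and `lenet5_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet5_shapes(size=32):
    """Trace (layer name, height/width, channels) through both conv/pool stages."""
    shapes = [("input", size, 1)]              # 32 x 32 grayscale image
    size = conv_out(size, kernel=5, stride=1)  # six 5 x 5 filters, stride 1
    shapes.append(("conv1", size, 6))
    size = conv_out(size, kernel=2, stride=2)  # 2 x 2 average pooling, stride 2
    shapes.append(("pool1", size, 6))
    size = conv_out(size, kernel=5, stride=1)  # 16 filters of 5 x 5 (stride 1 assumed)
    shapes.append(("conv2", size, 16))
    size = conv_out(size, kernel=2, stride=2)  # 2 x 2 average pooling, stride 2
    shapes.append(("pool2", size, 16))
    return shapes


shapes = lenet5_shapes()
for name, size, channels in shapes:
    print(f"{name}: {size} x {size} x {channels}")

# Number of values handed to the first fully connected layer.
_, last_size, last_channels = shapes[-1]
flattened = last_size * last_size * last_channels
print("flattened features:", flattened)
```

Under these assumptions the image shrinks from 32 x 32 to 5 x 5 while the channel count grows from 1 to 16, so the first fully connected layer receives 400 values.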

University of Michigan - Ann Arbor

There are three main classic convolutional neural network architectures:
- LeNet-5
- AlexNet
- VGG-16

Classic Convolutional Neural Network Architectures for Object Detection in Images

LeNet-5 Convolutional Neural Network

- AlexNet is trained on 227 x 227 RGB images.
- The first layer has 96 filters of size 11 x 11 with a stride of four. Its output is passed to 3 x 3 max pooling with a stride of two.
- The second layer has 256 same-convolution filters of size 5 x 5, whose output is passed to 3 x 3 max pooling with a stride of two.
- The third layer has 384 same-convolution filters of size 3 x 3.
- The fourth layer has 384 same-convolution filters of size 3 x 3.
- The fifth layer has 256 same-convolution filters of size 3 x 3, whose output is passed to 3 x 3 max pooling with a stride of two.
- Two fully connected layers take the pooled output and pass it to a softmax output layer that classifies the input into one of 1,000 classes.
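
The same shape arithmetic can be traced through the AlexNet layers listed above. This is a rough sketch, assuming the first convolution and all pooling layers are unpadded and that every "same" convolution uses a stride of one; the table `ALEXNET_LAYERS` and the function `trace_shapes` are hypothetical names introduced for illustration.

```python
# (name, kernel, stride, same padding?, output channels or None for pooling)
ALEXNET_LAYERS = [
    ("conv1", 11, 4, False, 96),
    ("pool1", 3, 2, False, None),
    ("conv2", 5, 1, True, 256),
    ("pool2", 3, 2, False, None),
    ("conv3", 3, 1, True, 384),
    ("conv4", 3, 1, True, 384),
    ("conv5", 3, 1, True, 256),
    ("pool5", 3, 2, False, None),
]


def trace_shapes(layers, size=227, channels=3):
    """Return (layer name, height/width, channels) after each layer."""
    trace = [("input", size, channels)]
    for name, kernel, stride, same, out_channels in layers:
        # "Same" padding keeps the spatial size unchanged for stride 1, odd kernels.
        padding = (kernel - 1) // 2 if same else 0
        size = (size + 2 * padding - kernel) // stride + 1
        if out_channels is not None:  # pooling leaves the channel count alone
            channels = out_channels
        trace.append((name, size, channels))
    return trace


alexnet_trace = trace_shapes(ALEXNET_LAYERS)
for name, size, channels in alexnet_trace:
    print(f"{name}: {size} x {size} x {channels}")

# Number of values handed to the first fully connected layer.
_, final_size, final_channels = alexnet_trace[-1]
alexnet_flattened = final_size * final_size * final_channels
print("flattened features:", alexnet_flattened)
```

With these assumptions the 227 x 227 input becomes 55 x 55 after the first convolution and 6 x 6 x 256 after the last pooling layer, which is why 227 rather than 224 is the input size that makes the stride-four arithmetic come out evenly.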

AlexNet Convolutional Neural Network

- VGG-16 is trained on 224 x 224 RGB images.
- It always uses:
   - CONV = 3 x 3 filters, s = 1, same
   - MAX-POOL = 2 x 2, s = 2
- It has about 138 million parameters to train.
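
The figure of roughly 138 million parameters can be checked by counting weights and biases layer by layer for VGG-16. The sketch below assumes the standard VGG-16 layout (13 convolutional layers with the channel widths shown, followed by fully connected layers of 4096, 4096, and 1000 units); those widths are not given in the bullets above, and the names `VGG16_CONFIG`, `FC_SIZES`, and `count_vgg16_params` are hypothetical.

```python
# Assumed layout: integers are conv output channels, "M" is a 2 x 2 max pool (s = 2).
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]  # assumed fully connected widths, ending in 1,000 classes


def count_vgg16_params(config=VGG16_CONFIG, fc_sizes=FC_SIZES, size=224, in_channels=3):
    """Count trainable weights and biases for the assumed VGG-16 layout."""
    total = 0
    channels = in_channels
    for item in config:
        if item == "M":
            size //= 2  # pooling halves height and width, adds no parameters
        else:
            # 3 x 3 same convolution: one 3 x 3 x channels kernel plus a bias per filter
            total += 3 * 3 * channels * item + item
            channels = item
    features = size * size * channels  # flattened input to the first dense layer
    for width in fc_sizes:
        total += features * width + width
        features = width
    return total


vgg16_total = count_vgg16_params()
print(f"{vgg16_total:,} parameters (~{vgg16_total / 1e6:.0f} M)")
```

Under this layout most of the parameters sit in the first fully connected layer, which connects the 7 x 7 x 512 feature map to 4096 units, rather than in the 3 x 3 convolutions.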
