NiN Model Code Implementation

A Network in Network (NiN) model is built in a deep learning framework by chaining NiN blocks, max-pooling layers, and a final global average pooling layer. The network uses the same initial convolution sizes as AlexNet: NiN blocks with 11×11, 5×5, and 3×3 kernels, each followed by a 3×3 max-pooling layer with stride 2 that roughly halves the spatial dimensions. A dropout layer precedes the final block. Instead of ending with parameter-heavy dense layers, the network concludes with a NiN block that outputs one channel per target class, followed by global average pooling and a flattening operation that together yield the classification logits.
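To make the "parameter-heavy dense layers" point concrete, the following is a rough sketch that counts the learnable parameters of the final NiN block and compares them with a dense classification head. It assumes the standard NiN block layout (one spatial convolution followed by two 1×1 convolutions). The dense head sizes (a flattened 256×5×5 feature map and two hidden layers of 4096 units) are illustrative assumptions in the style of AlexNet, not values taken from this page, and the helper names `conv_params` and `dense_params` are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a square 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


num_classes = 10

# Final NiN block: a 3x3 convolution from 384 channels to num_classes,
# then two 1x1 convolutions (assumed standard NiN block layout).
# Global average pooling and flattening add no parameters.
nin_head = (conv_params(384, num_classes, 3)
            + conv_params(num_classes, num_classes, 1)
            + conv_params(num_classes, num_classes, 1))

# Hypothetical dense head for comparison (assumed sizes, not from the text):
# flattened 256 x 5 x 5 features -> 4096 -> 4096 -> num_classes.
dense_head = (dense_params(256 * 5 * 5, 4096)
              + dense_params(4096, 4096)
              + dense_params(4096, num_classes))

print(f"NiN head:   {nin_head:,} parameters")
print(f"Dense head: {dense_head:,} parameters")
```

Under these assumptions the NiN head needs tens of thousands of parameters while the dense head needs tens of millions, which is the motivation for ending the network with a class-channel block plus global average pooling.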

PyTorch Implementation:

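from torch import nn
from d2l import torch as d2l

# nin_block(out_channels, kernel_size, strides, padding) is assumed to be
# defined beforehand (the NiN block from the prerequisite material).
class NiN(d2l.Classifier):
    def __init__(self, lr=0.1, num_classes=10):
        super().__init__()
        self.save_hyperparameters()
        self.net = nn.Sequential(
            nin_block(96, kernel_size=11, strides=4, padding=0),
            nn.MaxPool2d(3, stride=2),
            nin_block(256, kernel_size=5, strides=1, padding=2),
            nn.MaxPool2d(3, stride=2),
            nin_block(384, kernel_size=3, strides=1, padding=1),
            nn.MaxPool2d(3, stride=2),
            nn.Dropout(0.5),
            # One output channel per class replaces the dense layers.
            nin_block(num_classes, kernel_size=3, strides=1, padding=1),
            nn.AdaptiveAvgPool2d((1, 1)),  # global average pooling
            nn.Flatten())  # (batch, num_classes, 1, 1) -> (batch, num_classes)
        self.net.apply(d2l.init_cnn)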

MXNet Implementation:

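from mxnet import init
from mxnet.gluon import nn
from d2l import mxnet as d2l

# nin_block(out_channels, kernel_size, strides, padding) is assumed to be
# defined beforehand (the NiN block from the prerequisite material).
class NiN(d2l.Classifier):
    def __init__(self, lr=0.1, num_classes=10):
        super().__init__()
        self.save_hyperparameters()
        self.net = nn.Sequential()
        self.net.add(
            nin_block(96, kernel_size=11, strides=4, padding=0),
            nn.MaxPool2D(pool_size=3, strides=2),
            nin_block(256, kernel_size=5, strides=1, padding=2),
            nn.MaxPool2D(pool_size=3, strides=2),
            nin_block(384, kernel_size=3, strides=1, padding=1),
            nn.MaxPool2D(pool_size=3, strides=2),
            nn.Dropout(0.5),
            # One output channel per class replaces the dense layers.
            nin_block(num_classes, kernel_size=3, strides=1, padding=1),
            nn.GlobalAvgPool2D(),  # global average pooling
            nn.Flatten())  # (batch, num_classes, 1, 1) -> (batch, num_classes)
        self.net.initialize(init.Xavier())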

JAX Implementation:

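from flax import linen as nn
from d2l import jax as d2l

# nin_block(out_channels, kernel_size, strides, padding) is assumed to be
# defined beforehand (the NiN block from the prerequisite material).
class NiN(d2l.Classifier):
    lr: float = 0.1
    num_classes: int = 10  # annotated so it is a configurable dataclass field
    training: bool = True

    def setup(self):
        self.net = nn.Sequential([
            nin_block(96, kernel_size=(11, 11), strides=(4, 4), padding=(0, 0)),
            lambda x: nn.max_pool(x, (3, 3), strides=(2, 2)),
            nin_block(256, kernel_size=(5, 5), strides=(1, 1), padding=(2, 2)),
            lambda x: nn.max_pool(x, (3, 3), strides=(2, 2)),
            nin_block(384, kernel_size=(3, 3), strides=(1, 1), padding=(1, 1)),
            lambda x: nn.max_pool(x, (3, 3), strides=(2, 2)),
            nn.Dropout(0.5, deterministic=not self.training),
            # One output channel per class replaces the dense layers.
            nin_block(self.num_classes, kernel_size=(3, 3), strides=(1, 1),
                      padding=(1, 1)),
            # Global average pooling; the fixed 5x5 window assumes a 5x5
            # feature map at this point (e.g. for 224x224 inputs).
            lambda x: nn.avg_pool(x, (5, 5)),
            lambda x: x.reshape((x.shape[0], -1))  # flatten
        ])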
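The fixed (5, 5) window in the JAX average pooling only acts as global pooling when the feature map reaching it is 5×5. The sketch below traces the spatial size through the layers using the usual output-size formula for convolutions and pooling, floor((n + 2p − k) / s) + 1. The 224×224 input size is an assumption (it is not stated on this page), and the helper names `out_size` and `trace` are hypothetical.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for each layer that changes spatial size,
# in the order used by the model. The 1x1 convolutions inside each NiN block
# leave the spatial size unchanged, so they are omitted.
LAYERS = [
    ("nin_block 96", 11, 4, 0),
    ("max_pool", 3, 2, 0),
    ("nin_block 256", 5, 1, 2),
    ("max_pool", 3, 2, 0),
    ("nin_block 384", 3, 1, 1),
    ("max_pool", 3, 2, 0),
    ("nin_block num_classes", 3, 1, 1),
]


def trace(size):
    """Return (layer name, spatial size after that layer) for a square input."""
    sizes = []
    for name, kernel, stride, padding in LAYERS:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


# Assumed 224x224 input; prints the side length after each layer.
for name, size in trace(224):
    print(f"{name:>22}: {size} x {size}")
```

With a 224×224 input the last NiN block sees a 5×5 map, matching the hard-coded window. A different input size changes that (a 96×96 input, for instance, ends at 1×1), whereas the adaptive or global pooling layers used in the PyTorch, MXNet, and TensorFlow variants do not depend on the input size.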

TensorFlow Implementation:

import tensorflow as tf
from d2l import tensorflow as d2l

# nin_block(out_channels, kernel_size, strides, padding) is assumed to be
# defined beforehand (the NiN block from the prerequisite material).
class NiN(d2l.Classifier):
    def __init__(self, lr=0.1, num_classes=10):
        super().__init__()
        self.save_hyperparameters()
        self.net = tf.keras.models.Sequential([
            nin_block(96, kernel_size=11, strides=4, padding='valid'),
            tf.keras.layers.MaxPool2D(pool_size=3, strides=2),
            nin_block(256, kernel_size=5, strides=1, padding='same'),
            tf.keras.layers.MaxPool2D(pool_size=3, strides=2),
            nin_block(384, kernel_size=3, strides=1, padding='same'),
            tf.keras.layers.MaxPool2D(pool_size=3, strides=2),
            tf.keras.layers.Dropout(0.5),
            # One output channel per class replaces the dense layers.
            nin_block(num_classes, kernel_size=3, strides=1, padding='same'),
            tf.keras.layers.GlobalAvgPool2D(),  # global average pooling
            tf.keras.layers.Flatten()])

Updated 2026-05-13

Tags

D2L

Dive into Deep Learning @ D2L