Concept

Minibatch Shape Transformation for RNNs

When training a recurrent neural network, each minibatch of token indices is initially sampled with the shape (batch size, number of time steps). One-hot encoding each token transforms the minibatch into a three-dimensional tensor with the shape (batch size, number of time steps, vocabulary size). So that the hidden state can be updated one time step at a time, this tensor is commonly transposed to put the time step in the outermost dimension, giving the shape (number of time steps, batch size, vocabulary size). Indexing the first dimension then yields the (batch size, vocabulary size) input for a single time step.
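The shape changes described above can be traced with a minimal sketch. It uses NumPy rather than any particular deep learning framework, and the sizes (batch size 2, 5 time steps, vocabulary size 28) and the variable names X, one_hot, and inputs are illustrative assumptions, not values from the source.

```python
import numpy as np

# Illustrative sizes (assumptions for this sketch)
batch_size, num_steps, vocab_size = 2, 5, 28

# Minibatch of token indices: (batch size, number of time steps)
X = np.arange(batch_size * num_steps).reshape(batch_size, num_steps)

# One-hot encode each token by selecting rows of an identity matrix:
# (batch size, number of time steps, vocabulary size)
one_hot = np.eye(vocab_size, dtype=np.float32)[X]

# Swap the first two axes so the time step is outermost:
# (number of time steps, batch size, vocabulary size)
inputs = one_hot.transpose(1, 0, 2)

print(X.shape, one_hot.shape, inputs.shape)

# Each inputs[t] is the (batch size, vocabulary size) input at time step t
for t in range(num_steps):
    assert inputs[t].shape == (batch_size, vocab_size)
```

Transposing the index matrix first and one-hot encoding afterwards produces the same tensor; the order here simply follows the description above.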

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L