Concept
Minibatch Shape Transformation for RNNs
When training a recurrent neural network, each minibatch is initially sampled as a matrix of token indices with the shape (batch size, number of time steps). One-hot encoding each token turns this minibatch into a three-dimensional tensor with the shape (batch size, number of time steps, vocabulary size). Because the hidden state is updated one time step at a time, this tensor is commonly transposed so that the outermost dimension indexes the time step, giving the shape (number of time steps, batch size, vocabulary size). Iterating over the first dimension then yields, at each step, a (batch size, vocabulary size) slice that covers every sequence in the batch.
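The shape changes described above can be traced in a minimal sketch. It uses NumPy rather than any particular deep learning framework, and the sizes (batch size 2, 5 time steps, vocabulary size 28) and variable names are illustrative assumptions, not values taken from the source.

```python
import numpy as np

# Illustrative sizes (assumptions for this sketch).
batch_size, num_steps, vocab_size = 2, 5, 28

# A minibatch of integer token indices, shape (batch_size, num_steps).
X = np.arange(batch_size * num_steps).reshape(batch_size, num_steps) % vocab_size

# One-hot encode by picking rows of an identity matrix:
# shape becomes (batch_size, num_steps, vocab_size).
one_hot = np.eye(vocab_size, dtype=np.float32)[X]

# Swap the first two axes so the time step comes first:
# shape becomes (num_steps, batch_size, vocab_size).
inputs = one_hot.transpose(1, 0, 2)

print(X.shape)        # (2, 5)
print(one_hot.shape)  # (2, 5, 28)
print(inputs.shape)   # (5, 2, 28)

# Each slice along the first axis holds one time step for the whole batch.
for step_input in inputs:
    assert step_input.shape == (batch_size, vocab_size)
```

Transposing the index matrix first and one-hot encoding afterwards, as in `np.eye(vocab_size, dtype=np.float32)[X.T]`, produces the same time-major tensor while transposing the smaller two-dimensional array.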
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L