Concept

Vectorized Minibatch Softmax Regression

To maximize computational efficiency, the forward pass of a softmax regression model is typically vectorized across minibatches. For a minibatch of inputs $\mathbf{X} \in \mathbb{R}^{n \times d}$ containing $n$ examples with $d$ features, and parameters $\mathbf{W} \in \mathbb{R}^{d \times q}$ (weights) and $\mathbf{b} \in \mathbb{R}^{1 \times q}$ (biases), the unnormalized logits are computed with the affine transformation $\mathbf{O} = \mathbf{X}\mathbf{W} + \mathbf{b}$. The softmax function is then applied rowwise to $\mathbf{O}$ to yield the normalized class probabilities $\hat{\mathbf{Y}} = \mathrm{softmax}(\mathbf{O})$ for the entire batch simultaneously.
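The vectorized forward pass above can be sketched in NumPy as follows. The shapes ($n=4$, $d=3$, $q=5$) and the random inputs are illustrative assumptions, not from the source; the rowwise max subtraction is a standard numerical-stability trick that leaves the softmax output unchanged.

```python
import numpy as np

def softmax(O):
    """Apply softmax rowwise to a matrix of logits O."""
    # Subtract the rowwise max for numerical stability;
    # this does not change the resulting probabilities.
    O_shifted = O - O.max(axis=1, keepdims=True)
    exp_O = np.exp(O_shifted)
    return exp_O / exp_O.sum(axis=1, keepdims=True)

# Hypothetical minibatch: n=4 examples, d=3 features, q=5 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # inputs,  shape (n, d)
W = rng.normal(size=(3, 5))   # weights, shape (d, q)
b = np.zeros((1, 5))          # biases,  shape (1, q), broadcast over rows

O = X @ W + b                 # logits O = XW + b, shape (n, q)
Y_hat = softmax(O)            # each row is a probability distribution
```

Because `b` has shape `(1, q)`, NumPy broadcasting adds the same bias vector to every row of `XW`, so the whole batch is processed in a single matrix multiply rather than a Python loop over examples.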

Updated 2026-05-03

Tags

D2L

Dive into Deep Learning @ D2L
