Formula

Order Preservation of the Softmax Function

The softmax function preserves the relative ordering of its inputs: the exponential function is strictly increasing, and the normalization divides every entry by the same positive constant $\sum_k \exp(o_k)$, so $o_i > o_j$ implies $\hat{y}_i > \hat{y}_j$. Consequently, the most likely class under the softmax probabilities $\hat{\mathbf{y}}$ is exactly the class with the largest raw output in $\mathbf{o}$. This means we can determine the predicted class without computing the softmax normalization:

$$\operatorname*{argmax}_j \hat{y}_j = \operatorname*{argmax}_j o_j$$
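
The identity above can be checked numerically. The following is a minimal sketch in plain Python, not code from the source: the helper names `softmax` and `argmax` and the example logits are assumptions chosen for illustration, and the maximum is subtracted before exponentiating only for numerical stability (it does not change the result).

```python
import math


def softmax(logits):
    """Softmax over a list of floats (illustrative helper, not from the source)."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(o - shift) for o in logits]
    total = sum(exps)  # same positive constant divides every entry
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest entry; returns the first index on ties."""
    return max(range(len(values)), key=lambda j: values[j])


# Hypothetical raw outputs (logits) for a 4-class problem.
logits = [1.5, -0.3, 4.2, 0.7]
probs = softmax(logits)

# Both routes pick the same class, so the softmax can be skipped at prediction time.
predicted_from_probs = argmax(probs)
predicted_from_logits = argmax(logits)
print(predicted_from_probs == predicted_from_logits)
```

The softmax is still needed when calibrated probabilities are required, for example when computing the cross-entropy loss; the shortcut applies only to picking the top class.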

Updated 2026-05-03

Tags

D2L

Dive into Deep Learning @ D2L