Learn Before
Numerical Overflow in Softmax Function
When calculating the softmax function, softmax(z)_i = exp(z_i) / sum_j exp(z_j), computational risks arise from the exponential operations. If some input logits z_k are very large positive numbers, computing exp(z_k) can produce values that exceed the maximum representable value of certain data types (about 3.4 × 10^38 for single-precision floating-point numbers). This phenomenon is known as numerical overflow, and it leads to numerical instability because the resulting predicted probabilities become undefined (inf or NaN).
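The standard remedy is to subtract max(z) from every logit before exponentiating: softmax is unchanged by adding a constant to all inputs, and after the shift every exponent is at most 0, so exp cannot overflow. A minimal NumPy sketch (function names are illustrative, not from the source):

```python
import numpy as np

def softmax_naive(z):
    """Direct softmax: exp() overflows for large logits."""
    e = np.exp(z)
    return e / e.sum()

def softmax_stable(z):
    """Subtract max(z) first; all exponents are then <= 0, so no overflow."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([1000.0, 1000.0], dtype=np.float32)
# naive: exp(1000) overflows float32 to inf, and inf/inf gives nan
# stable: exp(0)/(exp(0)+exp(0)) = 0.5 for each class
```

For the stable version, `z` becomes `[0, 0]` after the shift, so both probabilities come out as exactly 0.5; the naive version instead produces NaN for this input.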
Tags
D2L
Dive into Deep Learning @ D2L
Related
Pros and Cons of Softmax Function
Softmax Regression (Activation)
Parameterized Softmax Layer
Plackett-Luce Selection Probability Formula
Conditional Probability Formula for Autoregressive Models using Softmax
A neural network's final layer produces the raw output scores (logits) [2.0, 1.0, 0.1] for three possible classes. To convert these scores into class probabilities, a function is applied that first exponentiates each score and then normalizes these new values by dividing each by their sum. What is the resulting probability distribution? (Values are rounded to three decimal places.)

A function is used to convert a vector of raw, unnormalized scores z = [z_1, z_2, ..., z_K] into a probability distribution. This function operates by first applying the standard exponential function to each score and then normalizing these new values by dividing each by their sum. If a constant value C is added to every score in the input vector z, resulting in a new vector z' = [z_1+C, z_2+C, ..., z_K+C], how will the resulting output probability distribution be affected?

Consider two input vectors of raw scores (logits) for a 3-class classification problem: Vector A = [1, 2, 3] and Vector B = [1, 5, 10]. Both vectors are passed through a function that exponentiates each score and then normalizes the results by dividing by their sum. How will the resulting probability distribution for Vector B compare to the one for Vector A?

You're reviewing an internal evaluation script tha...
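The first two questions above can be checked numerically. This sketch (illustrative, not from the source) computes softmax for the logits [2.0, 1.0, 0.1] and verifies that adding a constant C to every score leaves the output distribution unchanged:

```python
import numpy as np

def softmax(z):
    # shift by max(z) for stability; this does not change the result
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
probs = softmax(z)
print(np.round(probs, 3))                 # approximately [0.659, 0.242, 0.099]

# shift invariance: the common factor exp(C) cancels in the ratio
print(np.allclose(probs, softmax(z + 100.0)))  # True
```

Algebraically, exp(z_i + C) / sum_j exp(z_j + C) = exp(C)·exp(z_i) / (exp(C)·sum_j exp(z_j)), so the exp(C) factors cancel, which is exactly why the max-subtraction trick is safe.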
Your team is building an internal tool that ranks ...
You're reviewing an internal LLM evaluation pipeli...
Reconciling Training Log-Likelihood with Inference-Time Sequence Selection
Explaining a Counterintuitive Decoding Outcome Using Softmax, Next-Token Conditionals, and Sequence Log-Probability
Diagnosing a "High-Confidence Wrong Token" Bug in Autoregressive Scoring
Investigating a Production Scoring Bug: Softmax Normalization vs. Autoregressive Sequence Log-Probability
Design a Correct Sequence-Scoring Function for Autoregressive LLM Outputs
Root-Cause Analysis: Why a "More Likely" Token-by-Token Completion Loses on Total Sequence Score
Auditing a Candidate Completion Using Softmax Next-Token Probabilities and Autoregressive Log-Probability
Derivative of Softmax Cross-Entropy Loss with Respect to Logits