Numerical Overflow in Softmax Function

When calculating the softmax function, $\hat y_j = \frac{\exp(o_j)}{\sum_k \exp(o_k)}$, the exponentials introduce a computational risk. If some input logits $o_k$ are very large positive numbers, computing $\exp(o_k)$ can produce values that exceed the largest number a data type can represent (on the order of $10^{38}$ for single-precision floating-point numbers). This phenomenon is known as numerical overflow. It makes the computation numerically unstable: the overflowed terms are stored as infinity, so the ratio can no longer be evaluated and the resulting predicted probabilities become undefined (NaN).
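The effect can be reproduced with a short sketch. The code below assumes NumPy and single-precision inputs; the helper name `naive_softmax` and the example logits are illustrative choices, not taken from the source. It applies the softmax definition directly, without any safeguard, to a small and a large set of logits.

```python
import numpy as np

def naive_softmax(logits):
    """Softmax computed directly from the definition (hypothetical helper, no safeguards)."""
    exps = np.exp(logits)        # overflows to inf once a float32 logit exceeds roughly 88.7
    return exps / np.sum(exps)   # inf / inf evaluates to nan

# Illustrative logits (assumed values): one moderate set, one with a very large entry.
small = np.array([1.0, 2.0, 3.0], dtype=np.float32)
large = np.array([10.0, 50.0, 100.0], dtype=np.float32)

# Silence NumPy's overflow/invalid warnings so the raw results can be inspected.
with np.errstate(over="ignore", invalid="ignore"):
    p_small = naive_softmax(small)
    p_large = naive_softmax(large)

print("float32 max:", np.finfo(np.float32).max)
print("small logits:", p_small)
print("large logits:", p_large)
```

With the moderate logits the outputs form a valid probability distribution. With the large logits, $\exp(100)$ exceeds the single-precision range, so the entry for that logit becomes NaN and the output no longer sums to 1.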

Updated 2026-05-03

Tags

D2L

Dive into Deep Learning @ D2L
