Learn Before
Energy-Based View of Softmax
The softmax function has roots in statistical physics, closely mirroring the Boltzmann distribution, where the probability of a thermodynamic state with energy E at temperature T is proportional to exp(-E/(kT)). By treating the model's error (or negative logit) as energy, the softmax formulation naturally models a distribution over states, forming the conceptual basis for energy-based models in deep learning.
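This correspondence can be checked numerically. Below is a minimal sketch (the function name `boltzmann_softmax` and the choice of NumPy are illustrative assumptions, not part of the original card): treating each score as a negative energy and dividing by a temperature recovers the usual softmax at T = 1, while lowering T sharpens the distribution toward the lowest-energy state.

```python
import numpy as np

def boltzmann_softmax(energies, temperature=1.0):
    """Softmax viewed as a Boltzmann distribution: the probability of
    state i with energy E_i is proportional to exp(-E_i / T)."""
    # Lower energy corresponds to a higher logit, hence higher probability.
    scaled = -np.asarray(energies, dtype=float) / temperature
    scaled -= scaled.max()          # shift for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()  # normalization = partition function

energies = np.array([1.0, 2.0, 3.0])
print(boltzmann_softmax(energies, temperature=1.0))
print(boltzmann_softmax(energies, temperature=0.1))  # sharper: low-energy state dominates
```

At T = 1 this is exactly softmax applied to the negative energies; as T decreases, probability mass concentrates on the minimum-energy state, mirroring a physical system cooling toward its ground state.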
Tags
D2L
Dive into Deep Learning @ D2L
Related
Softmax Function Definition
A vector of raw, unnormalized scores
[1000, 1002, 999] is passed as input to a computational function that converts these scores into a probability distribution. A common technique to prevent numerical errors is to first subtract the maximum value of the vector from every element before applying the main transformation (exponentiation). Why is this subtraction step crucial for handling large input values?
Calculating Output Probabilities from Model Scores
A model outputs the following raw, unnormalized scores for three classes:
[2.0, 1.0, 0.1]. If a constant value of 5.0 is added to each of these scores, resulting in a new score vector of [7.0, 6.0, 5.1], how will the resulting probability distribution calculated by the function that converts these scores to probabilities change?
Order Preservation of the Softmax Function
Energy-Based View of Softmax
Output Layer of Softmax Regression
Partition Function in Softmax
Vectorized Minibatch Softmax Regression
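The two related questions above, on max-subtraction for large inputs and on adding a constant to every score, can both be verified numerically. A minimal sketch (NumPy and the helper name `softmax` are assumptions for illustration): naive exponentiation of [1000, 1002, 999] overflows to infinity, while subtracting the max keeps every exponent at or below zero; and because exp(x + c) = exp(c) · exp(x), the constant c cancels in the normalization, so the probabilities are unchanged.

```python
import numpy as np

def softmax(x):
    # Subtract the max so the largest exponent is exp(0) = 1,
    # preventing overflow for large inputs such as [1000, 1002, 999].
    shifted = x - np.max(x)
    e = np.exp(shifted)
    return e / e.sum()

scores = np.array([1000.0, 1002.0, 999.0])
naive = np.exp(scores)            # overflows to inf, so inf / inf gives nan
print(np.isinf(naive).any())      # True
print(softmax(scores))            # stable, well-defined probabilities

# Shift invariance: adding a constant leaves the distribution unchanged,
# because exp(x + c) = exp(c) * exp(x) cancels in the normalization.
a = np.array([2.0, 1.0, 0.1])
print(np.allclose(softmax(a), softmax(a + 5.0)))  # True
```

The same cancellation argument explains why the max-subtraction trick is safe: it is just a particular choice of the constant c, so it changes nothing about the output distribution while taming the exponentials.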