Learn Before
Partition Function in Softmax
In the formula for the softmax function, the denominator used to normalize the exponentiated terms, given by , is known as the partition function. Its logarithm is referred to as the log partition function. This terminology is borrowed from statistical physics, where the partition function represents the sum over all possible states in a thermodynamic ensemble.
0
1
Tags
D2L
Dive into Deep Learning @ D2L
Related
Softmax Function Definition
A vector of raw, unnormalized scores
[1000, 1002, 999]is passed as input to a computational function that converts these scores into a probability distribution. A common technique to prevent numerical errors is to first subtract the maximum value of the vector from every element before applying the main transformation (exponentiation). Why is this subtraction step crucial for handling large input values?Calculating Output Probabilities from Model Scores
A model outputs the following raw, unnormalized scores for three classes:
[2.0, 1.0, 0.1]. If a constant value of 5.0 is added to each of these scores, resulting in a new score vector of[7.0, 6.0, 5.1], how will the resulting probability distribution calculated by the function that converts these scores to probabilities change?Order Preservation of the Softmax Function
Energy-Based View of Softmax
Output Layer of Softmax Regression
Partition Function in Softmax
Vectorized Minibatch Softmax Regression