Probability Normalization over a Candidate Set
The conditional probability of a specific token, given a context, can be determined by normalizing its score against the scores of the other tokens. This is achieved by dividing the score of the target token by the sum of the scores for all tokens within a defined candidate set. This method ensures the resulting probabilities for all tokens in the set sum to 1. The general formula is:

Pr(y_i | x, y_{<i}) = score(y_i) / Σ_{y' ∈ C} score(y')

where C is the candidate set, each score is nonnegative, and every resulting probability lies between 0 and 1.
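The normalize-by-sum idea above can be sketched in a few lines, assuming nonnegative scores; the `normalize` helper and the example scores are illustrative, not from the source:

```python
def normalize(scores):
    """Turn nonnegative candidate scores into a probability distribution
    by dividing each score by the sum over the candidate set."""
    total = sum(scores.values())
    return {token: s / total for token, s in scores.items()}

# Illustrative candidate set and scores (assumed values):
probs = normalize({"mat": 6.0, "rug": 3.0, "floor": 0.5, "chair": 0.5})
# The resulting probabilities sum to 1 over the candidate set.
```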
Tags
Ch.5 Inference - Foundations of Large Language Models
Computing Sciences
Related
Interpreting Autoregressive Model Inputs
An autoregressive model is given an input prompt, x, which is the sequence 'The best movie I ever saw was'. The model has already generated the partial output sequence, y_{<i}, which is 'about a'. The model's next task is to predict the probability of the next token, y_i, based on the standard conditional probability notation Pr(y_i | x, y_{<i}). What is the actual, full sequence of tokens the model uses as its context to make this prediction?
In the context of autoregressive sequence generation, the notation Pr(y_i | x, y_{<i}) implies that the model treats the input x and the previously generated tokens y_{<i} as two separate, distinct sources of information for predicting the next token y_i.
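In practice the model conditions on the concatenation of the prompt and the tokens generated so far. A minimal sketch, using whitespace tokenization purely for illustration (real models use subword tokenizers):

```python
# Assumed example sequences from the question above.
prompt = "The best movie I ever saw was".split()   # x
generated = "about a".split()                      # y_{<i}

# The full context for predicting y_i is the prompt followed by
# everything generated so far, as one sequence.
context = prompt + generated
```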
Learn After
Conditional Probability Formula for Autoregressive Models using Softmax
A language model is predicting the next word in a sequence. After processing the context, it has assigned the following unnormalized scores to a set of four candidate words: 'mat' (score=6.0), 'rug' (score=3.0), 'floor' (score=0.5), and 'chair' (score=0.5). To convert these scores into a valid probability distribution over this set, what is the final probability assigned to the word 'mat'?
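A minimal sketch of the computation the question above asks for, assuming the scores are normalized directly by their sum as defined in this section (rather than passed through an exponential first):

```python
# Unnormalized scores from the question.
scores = {"mat": 6.0, "rug": 3.0, "floor": 0.5, "chair": 0.5}

total = sum(scores.values())       # 6.0 + 3.0 + 0.5 + 0.5 = 10.0
p_mat = scores["mat"] / total      # 6.0 / 10.0 = 0.6
```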
A language model is evaluating three candidate tokens (A, B, C) to follow a given context. Initially, their scores are: Token A = 4, Token B = 4, Token C = 2. If the score for Token C is increased to 12, while the scores for Token A and Token B remain unchanged, how does this affect the normalized probabilities of Token A and Token B?
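The scenario above can be checked numerically. Although the raw scores of Token A and Token B are unchanged, raising Token C's score grows the shared denominator, so the normalized probabilities of A and B both fall. A sketch, assuming direct score normalization:

```python
def normalize(scores):
    """Divide each score by the sum over the candidate set."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

before = normalize({"A": 4, "B": 4, "C": 2})    # denominator 10
after = normalize({"A": 4, "B": 4, "C": 12})    # denominator 20
# A and B each drop from 4/10 to 4/20 even though their scores are fixed.
```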
Comparing Model Confidence via Probability Normalization
Softmax Function
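The softmax function applies the same normalize-by-sum idea to exponentiated scores, which handles negative logits and guarantees nonnegative terms. A minimal sketch with the standard max-subtraction trick for numerical stability (example logits are assumed):

```python
import math

def softmax(logits):
    """Exponentiate each logit, then normalize by the sum.
    Subtracting the max logit first avoids overflow without
    changing the result."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([6.0, 3.0, 0.5, 0.5])
```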