Log-Likelihood of a Sequence
The log-likelihood of a sequence of tokens, denoted as $\log P(x)$, is a fundamental metric used in training and evaluating language models. It is calculated by applying the logarithm to the chain rule of probability, which transforms the product of conditional probabilities into a more numerically stable sum. For a sequence $x = (x_1, x_2, \ldots, x_n)$, the log-likelihood is computed as:

$$\log P(x) = \sum_{i=1}^{n} \log P(x_i \mid x_1, \ldots, x_{i-1})$$

This calculation is performed for each sequence in a training dataset to determine the overall likelihood of the data given the model.
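The formula above can be sketched in a few lines of Python; the per-token conditional probabilities below are made-up values standing in for a real model's outputs.

```python
import math

# Hypothetical conditional probabilities P(x_i | x_1, ..., x_{i-1}) for a
# four-token sequence; illustrative values, not from a trained model.
cond_probs = [0.2, 0.05, 0.3, 0.1]

# Log-likelihood of the sequence: sum the log of each conditional probability.
log_likelihood = sum(math.log(p) for p in cond_probs)

# Equivalent in exact arithmetic (but less numerically stable for long
# sequences): the log of the product.
direct = math.log(math.prod(cond_probs))
```

Both routes agree here because the sequence is short; the sum-of-logs form is what scales to real sequence lengths.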

References
Reference of Foundations of Large Language Models Course
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Related
Log-Likelihood of a Sequence
When calculating the probability of a long sequence of words, the standard approach involves multiplying many conditional probabilities, each of which is a value between 0 and 1. This product is often converted into a sum by applying the logarithm to each term. What is the primary computational reason for this transformation?
A language model calculates the probability of a sequence of tokens, $x = (x_1, \ldots, x_n)$, using the product of conditional probabilities: $P(x) = \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1})$. To improve numerical stability and simplify calculations, this product is converted into a sum by taking the logarithm. Which of the following expressions correctly represents the log-probability, $\log P(x)$?
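The numerical-stability point behind the question above can be demonstrated directly: multiplying many small probabilities underflows 64-bit floats to zero, while the equivalent sum of logs stays representable. The probability value used here is an arbitrary illustration.

```python
import math

# 200 tokens, each with a (made-up) conditional probability of 1e-5.
probs = [1e-5] * 200

# Direct product underflows: 1e-1000 is far below the smallest positive
# float64 (~5e-324), so the result collapses to 0.0.
product = math.prod(probs)
# math.log(product) would raise ValueError: math domain error.

# Summing logs is equivalent in exact arithmetic and perfectly stable.
log_prob = sum(math.log(p) for p in probs)  # roughly -2302.6
```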
Calculating Sequence Log-Probability
Learn After
Log-Probability of a Ranked Sequence
Log-Likelihood Objective for Language Model Training
A language model is generating a sequence of tokens. It has computed the following conditional log-probabilities for a three-token sequence, where each token's probability is dependent on the ones that came before it:
- Log-probability of the first token: -1.8
- Log-probability of the second token, given the first: -2.5
- Log-probability of the third token, given the first two: -1.2
Based on these values, what is the total log-likelihood of this entire three-token sequence?
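A quick check of the arithmetic behind the question above: the total log-likelihood is simply the sum of the three conditional log-probabilities.

```python
# Conditional log-probabilities from the question:
# log P(x1), log P(x2 | x1), log P(x3 | x1, x2).
log_probs = [-1.8, -2.5, -1.2]

# Total log-likelihood = sum of the per-token terms, approximately -5.5.
total = sum(log_probs)
```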
Evaluating Sentence Plausibility
A language model has calculated the total log-likelihood for the sequence of tokens: ["The", "quick", "brown", "fox"]. The calculation involves summing the conditional log-probabilities of each token given the preceding ones. If the third token is changed from "brown" to "lazy", creating the new sequence ["The", "quick", "lazy", "fox"], which set of conditional log-probabilities must be re-calculated to find the new total log-likelihood?
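The dependency structure behind the question above can be made concrete with a toy scoring function: any term whose token or context touches the changed position must be recomputed. The function below is a hypothetical stand-in for a real model's conditional log-probability, chosen only so that the per-token terms are deterministic.

```python
def toy_cond_logprob(token, context):
    # Hypothetical deterministic score; a real model would compute
    # log P(token | context) from learned parameters.
    return -0.1 * len(token) - 0.05 * sum(len(c) for c in context)

def per_token_terms(tokens):
    # One conditional log-probability per position i: log P(x_i | x_<i).
    return [toy_cond_logprob(t, tokens[:i]) for i, t in enumerate(tokens)]

old = per_token_terms(["The", "quick", "brown", "fox"])
new = per_token_terms(["The", "quick", "lazy", "fox"])

# The first two terms are untouched: same token, same context.
# Indices 2 and 3 (the third and fourth tokens) change: the third term has a
# new token, and the fourth term has a new context.
changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
```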
Applying Log-Likelihood Calculation to a Training Dataset