Formula

Log-Likelihood of a Sequence

The log-likelihood of a token sequence, denoted $\log \text{Pr}(\mathbf{x})$, is a fundamental quantity used in training and evaluating language models. It is obtained by taking the logarithm of the chain rule of probability, which turns the product of conditional probabilities into a sum that is far less prone to numerical underflow. For a sequence $\mathbf{x} = (x_0, ..., x_m)$, the log-likelihood is computed as:

$$\log \text{Pr}(x_0, ..., x_m) = \sum_{i=0}^{m} \log \text{Pr}(x_i | x_0, ..., x_{i-1})$$

For $i = 0$ the context is empty, so the first term is simply $\log \text{Pr}(x_0)$. This calculation is performed for each sequence in a training dataset, and the per-sequence values are typically summed to give the overall log-likelihood of the data under the model.
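The formula can be illustrated with a minimal Python sketch. It assumes the conditional probabilities $\text{Pr}(x_i | x_0, ..., x_{i-1})$ have already been produced by some model and are passed in as plain floats. The function names `sequence_log_likelihood` and `dataset_log_likelihood` and the probability values are hypothetical, chosen only for illustration.

```python
import math

def sequence_log_likelihood(cond_probs):
    """Sum of log Pr(x_i | x_0, ..., x_{i-1}) for i = 0..m.

    cond_probs[i] is assumed to be the model's probability of token x_i
    given the preceding tokens (hypothetical input, not a real model API).
    """
    return sum(math.log(p) for p in cond_probs)

def dataset_log_likelihood(sequences):
    """Sum the per-sequence log-likelihoods, treating the sequences
    as independent (an assumption of this sketch)."""
    return sum(sequence_log_likelihood(s) for s in sequences)

# Made-up conditional probabilities for a 4-token sequence.
probs = [0.5, 0.25, 0.5, 0.125]
ll = sequence_log_likelihood(probs)

# The sum of logs agrees with the log of the chain-rule product.
product = math.prod(probs)
print(ll, math.log(product))

# For long sequences the raw product underflows to 0.0,
# while the sum of logs stays finite.
long_probs = [0.5] * 2000
print(math.prod(long_probs))                # underflows in float64
print(sequence_log_likelihood(long_probs))  # finite negative number
```

The last two lines show why the sum form is preferred in practice: the product of 2000 probabilities of 0.5 is too small for a 64-bit float, whereas its logarithm is an ordinary negative number.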

Updated 2026-05-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.5 Inference - Foundations of Large Language Models