Learn Before
Direct Computation of Output Sequence Log-Probability in LLMs
In common implementations of Large Language Models (LLMs), the log-probability of the input sequence does not need to be computed. Instead, the model directly computes the conditional log-probability of the output sequence given the input. This is done by summing the log-probabilities of each individual output token. The formula is:

$$\log \Pr(\mathbf{y} \mid \mathbf{x}) = \sum_{j=1}^{m} \log \Pr(y_j \mid \mathbf{x}, \mathbf{y}_{<j})$$

where $\mathbf{x}$ is the input sequence and $\mathbf{y} = y_1, \dots, y_m$ is the output sequence. In this notation, $\{\mathbf{x}, \mathbf{y}_{<j}\}$ represents the context used for predicting the token $y_j$. Furthermore, the expression $\mathbf{y}_{<j}$ is a common literature shorthand used to denote $y_1, \dots, y_{j-1}$.
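The sum above maps directly onto the per-token log-softmax outputs of a causal language model. Below is a minimal sketch, not taken from the book, that assumes a Hugging Face-style `model`/`tokenizer` pair (these names and the helper `output_log_prob` are illustrative assumptions); it scores only the output tokens, conditioning each one on the prompt plus the previously generated tokens.

```python
import torch
import torch.nn.functional as F

def output_log_prob(model, tokenizer, prompt: str, completion: str) -> float:
    """Return log Pr(completion | prompt) = sum_j log Pr(y_j | x, y_<j).

    Sketch only: assumes a Hugging Face-style causal LM whose forward pass
    returns `.logits` of shape (batch, seq_len, vocab_size), and that the
    tokenization of the prompt is a prefix of the tokenization of
    prompt + completion.
    """
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits          # (1, seq_len, vocab_size)
    log_probs = F.log_softmax(logits, dim=-1)    # normalize over the vocabulary
    total = 0.0
    # Position t predicts the token at position t + 1; sum only over the
    # completion tokens, skipping the prompt tokens.
    for t in range(prompt_len - 1, full_ids.shape[1] - 1):
        next_token_id = full_ids[0, t + 1]
        total += log_probs[0, t, next_token_id].item()
    return total
```

Comparing two candidate completions of the same prompt then reduces to comparing the two returned totals; the prompt's own log-probability never enters the calculation.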
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Conditional Probability in Sequence-to-Sequence Generation
Next-Token Probability Calculation in Autoregressive Decoders
Example of Autoregressive Generation and Log-Probability Calculation
An autoregressive language model is generating text following the input 'The cat sat on the'. The model's objective is to find the output sequence with the highest total log-probability. It is considering two possible two-word continuations:
Path A: 'warm mat'
- log Pr('warm' | 'The cat sat on the') = -0.9
- log Pr('mat' | 'The cat sat on the warm') = -1.5
Path B: 'plush rug'
- log Pr('plush' | 'The cat sat on the') = -1.2
- log Pr('rug' | 'The cat sat on the plush') = -1.1
Based on the provided conditional log-probabilities, which path will the model choose and why?
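Working the arithmetic out (this worked solution is not part of the original card text), the total log-probability of each path is the sum of its per-token terms:

$$\log \Pr(\text{Path A}) = -0.9 + (-1.5) = -2.4, \qquad \log \Pr(\text{Path B}) = -1.2 + (-1.1) = -2.3$$

Since $-2.3 > -2.4$, Path B ('plush rug') has the higher total sequence log-probability, even though Path A's first token is individually more probable.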
Debugging a Generation Model's Choice
Greedy Decoding vs. Optimal Sequence Probability
Reconciling Training Log-Likelihood with Inference-Time Sequence Selection
Diagnosing a “High-Confidence Wrong Token” Bug in Autoregressive Scoring
Explaining a Counterintuitive Decoding Outcome Using Softmax, Next-Token Conditionals, and Sequence Log-Probability
Auditing a Candidate Completion Using Softmax Next-Token Probabilities and Autoregressive Log-Probability
Investigating a Production Scoring Bug: Softmax Normalization vs. Autoregressive Sequence Log-Probability
Root-Cause Analysis: Why a “More Likely” Token-by-Token Completion Loses on Total Sequence Score
Design a Correct Sequence-Scoring Function for Autoregressive LLM Outputs
Your team is building an internal tool that ranks ...
You’re reviewing an internal evaluation script tha...
You’re reviewing an internal LLM evaluation pipeli...
Learn After
Sequence Evaluation using Log-Probability
An engineer is using a generative language model to decide which of two possible sentences is a more likely completion for the input prompt 'Once upon a time,'. The model can compute various log-probability scores. To select the better completion, which of the following scores should the engineer compare for each candidate sentence?
Debugging a Language Model's Output Score
Rationale for Log-Probability Calculation in Generative Models
Core Computational Task in Autoregressive Generation
Step-by-Step Sequence Log-Probability Computation