1Cademy - Auto-regressive Decomposition of Conditional Log-Likelihood

Learn Before

Fine-Tuning Objective as Log-Likelihood Maximization

Formula

Auto-regressive Decomposition of Conditional Log-Likelihood

The conditional log-likelihood, often used as an objective function in sequence modeling, is computed by applying the chain rule of probability: the probability of the entire output sequence $\mathbf{y}$ given the input $\mathbf{x}$ factors into a product of conditional probabilities, one per token. In log space, this product becomes a sum. Specifically, the log-probability of $\mathbf{y}$ given $\mathbf{x}$ is the sum of the log-probabilities of each token $y_i$, conditioned on the input $\mathbf{x}$ and all preceding tokens $\mathbf{y}_{<i}$. The formula, parameterized by $\theta$, is:

$$\log \text{Pr}_{\theta}(\mathbf{y}|\mathbf{x}) = \sum_{i=1}^{n} \log \text{Pr}_{\theta}(y_i|\mathbf{x}, \mathbf{y}_{<i})$$

where $n$ is the length of the sequence $\mathbf{y}$ and $\mathbf{y}_{<i} = (y_1, \dots, y_{i-1})$, which is empty for $i = 1$.
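The decomposition can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `toy_next_token_probs` is a hypothetical stand-in for $\text{Pr}_{\theta}(\cdot|\mathbf{x}, \mathbf{y}_{<i})$ over a two-token vocabulary, and it ignores the input `x` for simplicity. A real model would compute these distributions from its parameters $\theta$.

```python
import math

def toy_next_token_probs(x, prefix):
    """Hypothetical stand-in for Pr_theta(. | x, y_<i).

    Returns a distribution over the vocabulary {"a", "b"} that depends only
    on the last token of the prefix (the input x is ignored in this toy).
    """
    if not prefix:
        return {"a": 0.5, "b": 0.5}
    if prefix[-1] == "a":
        return {"a": 0.25, "b": 0.75}
    return {"a": 0.8, "b": 0.2}

def conditional_log_likelihood(x, y, next_token_probs):
    """Compute log Pr(y | x) as the sum over i of log Pr(y_i | x, y_<i)."""
    total = 0.0
    for i, token in enumerate(y):
        prefix = y[:i]  # y_<i: all tokens before position i
        total += math.log(next_token_probs(x, prefix)[token])
    return total

x = "some input"          # placeholder input, unused by the toy model
y = ["a", "a", "b"]       # output sequence of length n = 3

log_lik = conditional_log_likelihood(x, y, toy_next_token_probs)

# The same quantity obtained from the product of per-token probabilities.
joint_prob = 0.5 * 0.25 * 0.75
print(log_lik, math.log(joint_prob))
```

The two printed values agree up to floating-point error, which is the identity the formula states: the log of the product equals the sum of the logs. Summing logs is also the numerically safer route for long sequences, where the raw product can underflow.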

Updated 2026-05-02
