1Cademy - General Language Modeling Objective based on Joint Log-Probability

Learn Before

Probability of a Concatenated Token Sequence

Concept

General Language Modeling Objective based on Joint Log-Probability

A general and foundational approach to language modeling treats an input x and an output y as a single concatenated sequence, [x, y]. The training objective in this framework is to maximize the joint log-probability of the sequence, log Prθ(x, y). By the chain rule of probability, this joint log-probability factorizes into a sum of per-token conditional log-probabilities, each conditioned on all preceding tokens. In practice, the objective is optimized by minimizing the negative log-likelihood computed over every token in the combined sequence, including the tokens of x and not only those of y.
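The following is a minimal sketch of how such a loss could be computed, not an implementation taken from the referenced course. The function `toy_next_token_probs` is a hypothetical stand-in for a trained model Prθ, and the `loss_on_input` flag is an illustrative name introduced here to contrast the joint objective (loss over all tokens of [x, y]) with a conditional one (loss over the tokens of y only).

```python
import math

def toy_next_token_probs(prefix, vocab):
    """Hypothetical stand-in for a model: returns P(next token | prefix).

    Uniform when the prefix is empty; otherwise puts 0.5 on repeating
    the last token and splits the remaining 0.5 over the other tokens.
    """
    if not prefix:
        return {tok: 1.0 / len(vocab) for tok in vocab}
    last = prefix[-1]
    rest = 0.5 / (len(vocab) - 1)
    return {tok: (0.5 if tok == last else rest) for tok in vocab}

def sequence_nll(x, y, vocab, loss_on_input=True):
    """Negative log-likelihood of the concatenated sequence [x, y].

    With loss_on_input=True every token contributes, which corresponds
    to -log Pr(x, y) via the chain rule. With loss_on_input=False only
    the tokens of y contribute, which corresponds to -log Pr(y | x).
    """
    seq = list(x) + list(y)
    start = 0 if loss_on_input else len(x)
    nll = 0.0
    for i in range(start, len(seq)):
        probs = toy_next_token_probs(seq[:i], vocab)  # condition on all earlier tokens
        nll -= math.log(probs[seq[i]])
    return nll

vocab = ["a", "b", "c"]
x, y = ["a", "a"], ["b"]

joint_loss = sequence_nll(x, y, vocab, loss_on_input=True)         # -log Pr(x, y)
conditional_loss = sequence_nll(x, y, vocab, loss_on_input=False)  # -log Pr(y | x)
print(joint_loss, conditional_loss)
```

In this toy setting the joint loss is the sum of three per-token terms, while the conditional loss keeps only the final term for the single token of y. The difference between the two is exactly the negative log-probability the model assigns to x.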

Updated 2025-10-06

References

Reference of Foundations of Large Language Models Course

Learn After

Conditional vs. Joint Probability Objectives in Language Modeling
A language model is being trained with the objective of modeling the joint probability of an input sequence x and an output sequence y, which are treated as a single, concatenated sequence. During a single training step for this combined sequence, how is the model's performance error (loss) calculated?
Evaluating a Training Objective for a Base Model
A language model is being trained with the objective of modeling the joint probability of a combined sequence [x, y]. For this objective, the model's parameters are updated based only on its ability to correctly predict the tokens in the output sequence y.