1Cademy - Analysis of Language Model Training Objectives

Learn Before

Conditional vs. Joint Probability Objectives in Language Modeling

Short Answer

Analysis of Language Model Training Objectives

A researcher is training a language model for a summarization task using [article, summary] pairs. They are considering two different methods for calculating the training loss:

Method 1: The loss is calculated based on the model's predictions for all tokens in the concatenated [article, summary] sequence.
Method 2: The loss is calculated based only on the model's predictions for the tokens in the summary part of the sequence.
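
As a minimal sketch, the two methods above can be written as a single masked loss that differs only in which positions are counted. Everything here is an illustrative assumption rather than part of the original question: the helper names (`token_nll`, `sequence_loss`), the toy three-token vocabulary, the hand-picked probabilities, and the choice to average rather than sum the per-token losses.

```python
import math

def token_nll(probs, targets):
    """Negative log-likelihood of each target token under the predicted distribution."""
    return [-math.log(p[t]) for p, t in zip(probs, targets)]

def sequence_loss(probs, targets, loss_mask):
    """Mean NLL over the positions where loss_mask is 1 (hypothetical helper)."""
    nll = token_nll(probs, targets)
    kept = [n for n, m in zip(nll, loss_mask) if m]
    return sum(kept) / len(kept)

# Toy data (assumed): 5 target tokens over a 3-token vocabulary.
# The first 3 targets belong to the article, the last 2 to the summary.
probs = [
    [0.5, 0.25, 0.25],
    [0.25, 0.5, 0.25],
    [0.25, 0.25, 0.5],
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
]
targets = [0, 1, 2, 0, 1]
is_summary = [0, 0, 0, 1, 1]

# Method 1: every token in the concatenated sequence contributes to the loss.
method1_loss = sequence_loss(probs, targets, [1] * len(targets))

# Method 2: only the summary tokens contribute; article positions are masked out.
method2_loss = sequence_loss(probs, targets, is_summary)

print(method1_loss, method2_loss)
```

In both cases the model still reads the full [article, summary] sequence as input; the mask only changes which predictions are penalized during training.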

For each method, identify whether it corresponds to optimizing a joint probability objective or a conditional probability objective. Then, explain the key difference in what the model is being trained to accomplish with each objective.


Updated 2025-10-05
