Equivalence of Training Objectives
A language model can be trained by maximizing the joint log-likelihood of a concatenated input-output sequence, log Pr(x, y). This is often treated as equivalent to maximizing the conditional log-likelihood, log Pr(y|x). Explain the mathematical reasoning and the specific condition required for these two objectives to be equivalent in terms of finding the optimal model parameters.
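One way to organize the reasoning is sketched below. It is a sketch under stated assumptions, not a model answer. The parameter symbol \theta and the dataset symbol \mathcal{D} are notation introduced here for illustration and do not appear in the prompt. The first line is the chain rule of probability; the second line holds only under the assumption that the \log \Pr_\theta(x) term does not depend on \theta (for example, because the loss is not computed on the input tokens, so that term contributes no gradient).

```latex
\begin{align}
\log \Pr\nolimits_\theta(x, y)
  &= \log \Pr\nolimits_\theta(x) + \log \Pr\nolimits_\theta(y \mid x) \\
\arg\max_\theta \sum_{(x, y) \in \mathcal{D}} \log \Pr\nolimits_\theta(x, y)
  &= \arg\max_\theta \sum_{(x, y) \in \mathcal{D}} \log \Pr\nolimits_\theta(y \mid x)
  \quad \text{if } \log \Pr\nolimits_\theta(x) \text{ is constant in } \theta
\end{align}
```

If the input term does vary with \theta, the joint objective also rewards the model for predicting x, and the two maximizers can differ.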
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being trained on a dataset of input-output pairs (x, y). Two different training objectives are proposed:
- Objective A: Maximize the sum of log Pr(y|x) over all pairs in the dataset.
- Objective B: Maximize the sum of log Pr(sequence) over all pairs, where sequence is the concatenation of x and y.
Under which of the following conditions will optimizing for Objective B be mathematically equivalent to optimizing for Objective A?
Equivalence of Training Objectives
Evaluating a Language Model's Training Objective