Evaluating a Language Model's Training Objective
Based on the provided scenario, explain why the chosen training objective is leading to poor performance on the intended task. In your analysis, break down the joint probability into its constituent parts and describe how the relative lengths of the input and output sequences affect the model's learning process.
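The decomposition the prompt asks about can be illustrated numerically. This is a minimal sketch, not part of the original exercise: the per-token log-probabilities are made-up values and the function name `split_joint_log_prob` is hypothetical. It relies only on the chain rule, log Pr(x, y) = log Pr(x) + log Pr(y|x), and shows how the input's share of the joint objective grows when x is much longer than y.

```python
import math

def split_joint_log_prob(token_log_probs, input_len):
    """Split the log-probability of a concatenated sequence [x; y] into
    log Pr(x) (the first input_len tokens) and log Pr(y|x) (the rest).

    token_log_probs[i] is assumed to be log Pr(token_i | tokens before i),
    so by the chain rule the two parts sum to log Pr(x, y).
    """
    log_p_x = sum(token_log_probs[:input_len])
    log_p_y_given_x = sum(token_log_probs[input_len:])
    return log_p_x, log_p_y_given_x

# Hypothetical example: a 90-token input followed by a 10-token output,
# with every token assigned probability 0.5 for simplicity.
per_token = [math.log(0.5)] * 100
log_p_x, log_p_y_given_x = split_joint_log_prob(per_token, input_len=90)
joint = log_p_x + log_p_y_given_x

# Fraction of the joint objective contributed by the input tokens.
input_share = log_p_x / joint
print(f"log Pr(x)   = {log_p_x:.2f}")
print(f"log Pr(y|x) = {log_p_y_given_x:.2f}")
print(f"input share of joint objective = {input_share:.2f}")
```

Under these assumed numbers, roughly nine tenths of the joint log-likelihood comes from modelling x, so a model trained on the joint objective spends most of its loss on predicting the input and comparatively little on the output given the input.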
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being trained on a dataset of input-output pairs (x, y). Two different training objectives are proposed:
- Objective A: Maximize the sum of log Pr(y|x) over all pairs in the dataset.
- Objective B: Maximize the sum of log Pr(sequence) over all pairs, where sequence is the concatenation of x and y.
Under which of the following conditions will optimizing for Objective B be mathematically equivalent to optimizing for Objective A?
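For reference, the relationship between the two objectives follows from the chain rule of probability. The sketch below uses \theta for the model parameters and \mathcal{D} for the dataset; this notation is assumed here and does not appear in the original question.

```latex
\begin{align*}
\underbrace{\sum_{(x,y)\in\mathcal{D}} \log \Pr\nolimits_\theta(x, y)}_{\text{Objective B}}
  &= \sum_{(x,y)\in\mathcal{D}} \log \Pr\nolimits_\theta(x)
   + \underbrace{\sum_{(x,y)\in\mathcal{D}} \log \Pr\nolimits_\theta(y \mid x)}_{\text{Objective A}}
\end{align*}
```

The two objectives differ only by the input term \sum \log \Pr_\theta(x), so they have the same maximizers whenever that term does not depend on the parameters being optimized.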
Equivalence of Training Objectives