Critique of an Objective Function Formulation
A researcher proposes the following objective function to train a language model, where p_model(y|x) is the model's probability distribution and f(x, y) is an arbitrary scoring function that is not a probability distribution: Objective = E [log p_model(y|x) - f(x, y)]. From a conceptual standpoint, what is the primary weakness of formulating the objective in this way?
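The formulation can be probed numerically with a minimal sketch. It assumes, since the prompt does not say, that the expectation is taken over y drawn from p_model for a single fixed x. The toy values for `p_model` and `f` and the helper names are hypothetical, and exponentiating and normalizing f is only one possible way to turn a score into a distribution.

```python
import math

def expected_objective(p_model, score):
    """E_{y ~ p_model}[log p_model(y|x) - score(y)] for one fixed x (assumed sampling distribution)."""
    return sum(p * (math.log(p) - score[y]) for y, p in p_model.items() if p > 0)

def normalize(f):
    """Map arbitrary scores f(x, y) to a distribution q(y|x) = exp(f) / Z (one possible choice)."""
    z = sum(math.exp(v) for v in f.values())
    return {y: math.exp(v) / z for y, v in f.items()}

# Hypothetical toy values for a single input x.
p_model = {"a": 0.7, "b": 0.2, "c": 0.1}
f = {"a": 5.0, "b": 3.0, "c": 1.0}  # arbitrary scores, not log-probabilities

# Objective exactly as written in the prompt: f is used unnormalized.
raw = expected_objective(p_model, f)

# Same form, but with f replaced by the log of a normalized distribution q.
q = normalize(f)
log_q = {y: math.log(v) for y, v in q.items()}
kl = expected_objective(p_model, log_q)  # this is KL(p_model || q)

# The two quantities differ only by the log of the normalizer.
log_z = math.log(sum(math.exp(v) for v in f.values()))

print("raw objective:", raw)
print("KL(p_model || q):", kl)
print("raw + log Z:", raw + log_z)
```

With these toy values the raw objective is negative, whereas the normalized version is a KL divergence and is therefore never below zero. Adding a constant to every score shifts the raw value by that constant without changing which distribution is preferred, so the raw number has no fixed zero point and cannot be read as a distance between two distributions.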
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Normalizing a Function to Create a Probability Distribution
A machine learning model is being trained to generate outputs. Its behavior is described by a probability distribution p_model(y|x), and the desired behavior is captured by a target data distribution p_data(y|x). The training process involves minimizing an objective function. Which of the following objective function structures is most desirable because it can be clearly interpreted as a measure of the 'distance' or difference between the two distributions?
Critique of an Objective Function Formulation
Evaluating Objective Function Designs