A machine learning model is being trained to generate outputs. Its behavior is described by a probability distribution p_model(y|x), and the desired behavior is captured by a target data distribution p_data(y|x). The training process involves minimizing an objective function. Which of the following objective function structures is most desirable because it can be clearly interpreted as a measure of the 'distance' or difference between the two distributions?
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Normalizing a Function to Create a Probability Distribution
A machine learning model is being trained to generate outputs. Its behavior is described by a probability distribution
p_model(y|x), and the desired behavior is captured by a target data distributionp_data(y|x). The training process involves minimizing an objective function. Which of the following objective function structures is most desirable because it can be clearly interpreted as a measure of the 'distance' or difference between the two distributions?Critique of an Objective Function Formulation
Evaluating Objective Function Designs