True or False: When training a reward model using the loss function L = E[(human_score - predicted_reward)^2], the primary objective is to ensure that for any two outputs, the one with the higher human score also receives a higher predicted reward from the model.
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Calculating Pointwise Reward Model Loss
A machine learning engineer is training a reward model where human annotators assign an absolute quality score to each generated text. The engineer considers switching the loss function from Mean Squared Error (MSE), which calculates
(human_score - predicted_reward)^2, to Mean Absolute Error (MAE), which calculates |human_score - predicted_reward|. What is the most significant consequence of this change on the reward model's learned behavior?
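The contrast between the two pointwise losses can be sketched in plain Python. This is an illustrative example, not an implementation from any library; the names `mse_loss`, `mae_loss`, and the sample scores are invented for the demonstration. The key behavioral difference: MSE squares each residual, so a single large annotation error dominates the loss, while MAE penalizes residuals linearly and is more robust to such outliers.

```python
def mse_loss(human_scores, predicted_rewards):
    """Mean Squared Error: squares each residual, so outliers dominate the loss."""
    n = len(human_scores)
    return sum((h - p) ** 2 for h, p in zip(human_scores, predicted_rewards)) / n

def mae_loss(human_scores, predicted_rewards):
    """Mean Absolute Error: penalizes residuals linearly, more robust to outliers."""
    n = len(human_scores)
    return sum(abs(h - p) for h, p in zip(human_scores, predicted_rewards)) / n

# Hypothetical scores: the last example carries a large annotation error (-3.0).
human = [4.0, 3.0, 5.0, 1.0]
pred  = [3.5, 3.0, 4.5, 4.0]

print(mse_loss(human, pred))  # 2.375 -- the single 3.0 residual contributes 9/4
print(mae_loss(human, pred))  # 1.0   -- the same residual contributes only 3/4
```

Under MSE the one bad label accounts for most of the loss (9 of 9.5 before averaging), whereas under MAE it contributes proportionally, which is the behavioral shift the question asks about.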