Learn Before
Impact of Data Distribution on Reward Model Training
A team is training a reward model using a dataset of 10,000 preference pairs. They notice that 2,000 of these pairs are for the single prompt, 'Write a story about a robot,' while the remaining 8,000 pairs are distributed across 4,000 other unique prompts. The model is trained with the standard empirical loss formula, which averages the pairwise preference loss over every pair in the dataset.
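One common way to write this loss, assuming a Bradley-Terry preference model, is sketched here. The symbols are assumptions for illustration, since notation varies by source: $\mathcal{D}_r$ is the preference dataset, $y_a$ is the preferred and $y_b$ the rejected response for prompt $x$, $r_\phi$ is the reward model, and $\sigma$ is the sigmoid function.

```latex
\mathcal{L}(\phi) = -\frac{1}{|\mathcal{D}_r|}
  \sum_{(x,\, y_a,\, y_b) \in \mathcal{D}_r}
  \log \sigma\bigl( r_\phi(x, y_a) - r_\phi(x, y_b) \bigr)
```

In this form every pair carries the same weight $1/|\mathcal{D}_r|$, regardless of which prompt it belongs to.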
Analyze the most likely consequence of this data distribution on the trained reward model's behavior, and explain how the structure of the formula leads to this outcome.
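As a starting point for the analysis, the following minimal sketch counts how much weight each prompt receives when the loss is a uniform average over pairs. The prompt names `prompt_0` ... `prompt_3999`, the helper `loss_share_by_prompt`, and the even split of two pairs per remaining prompt are assumptions for illustration; the scenario only gives the totals.

```python
from collections import Counter

# Hypothetical dataset layout based on the scenario: one prompt with 2,000
# pairs, plus 4,000 other prompts. The even split of 2 pairs per remaining
# prompt is an assumption, not something the scenario states.
pair_prompts = ["Write a story about a robot"] * 2000
pair_prompts += [f"prompt_{i}" for i in range(4000) for _ in range(2)]


def loss_share_by_prompt(prompts):
    """Return each prompt's fraction of the weight in a uniformly averaged loss."""
    counts = Counter(prompts)
    total = len(prompts)
    return {prompt: n / total for prompt, n in counts.items()}


shares = loss_share_by_prompt(pair_prompts)
robot_share = shares["Write a story about a robot"]
other_share = shares["prompt_0"]

print(f"robot prompt share of loss weight: {robot_share:.2%}")
print(f"typical other prompt share: {other_share:.2%}")
```

Under these assumptions the robot prompt holds 2,000 / 10,000 = 20% of the total weight, while a prompt with two pairs holds 0.02%. These are weights on the per-pair loss terms, not the loss values themselves, which depend on the model's predictions.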
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Impact of Data Distribution on Reward Model Training
A researcher is training a reward model using a small preference dataset that contains exactly two preference pairs:
- For the first pair's input, one response is preferred over the other.
- For the second pair's input, one response is preferred over the other.
Given the empirical loss formula, which of the following expressions correctly represents the loss for this specific dataset?
Comparing Reward Model Performance