Learn Before
Importance of Variability in Pairwise Preference Data
Research indicates that significant variability within pairwise preference data is a key factor in successfully training Large Language Models, regardless of whether the feedback comes from humans or AI systems.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Evaluation Criteria for Pairwise Comparison in RLHF
Bradley-Terry Model
Reward Model Training as a Ranking Problem in RLHF
Listwise Ranking for Human Feedback in RLHF
Evaluating a Feedback Collection Strategy
A development team is refining a language model's ability to generate summaries. For each source document, the model produces two different summaries, which are presented side by side to a human annotator who selects the higher-quality one. Which statement best analyzes the primary strength of this approach to collecting human feedback?
Rationale for a Feedback Collection Method
Binary Encoding of Pairwise Feedback in RLHF
Learn After
A team is training a language model using preference data from a group of 10 labelers. For each prompt, the labelers are shown two potential responses and asked to choose the better one. The team considers two data collection strategies:
- Strategy 1: The team uses a highly aligned group of labelers who almost always agree. For 95% of the prompts, at least 9 out of 10 labelers choose the same response as the 'winner'.
- Strategy 2: The team uses a more diverse group of labelers. For many prompts, there is significant disagreement, with preferences often split 6-to-4 or 7-to-3.
Based on principles of effective model training, which strategy is likely to produce a more useful dataset, and why?
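To make the difference between the two strategies concrete, the vote splits mentioned above can be scored with a simple disagreement measure. The sketch below uses the binary entropy of each split. This metric is an assumption chosen for illustration, not one the source prescribes, and the helper name `vote_entropy` is hypothetical. It only quantifies how divided the labelers are; it does not by itself settle which strategy yields the more useful dataset.

```python
import math


def vote_entropy(votes_for_a: int, total: int) -> float:
    """Binary entropy (in bits) of a vote split between two responses.

    Returns 0.0 when all labelers agree and 1.0 for an even split.
    """
    p = votes_for_a / total
    if p in (0.0, 1.0):
        return 0.0  # unanimous vote: no disagreement
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Vote splits taken from the two strategies described above (10 labelers each).
splits = {"9-1": (9, 10), "7-3": (7, 10), "6-4": (6, 10)}
entropies = {name: vote_entropy(a, n) for name, (a, n) in splits.items()}

for name, h in entropies.items():
    print(f"{name}: {h:.3f} bits")
```

Under this measure, a 9-to-1 split carries far less disagreement than a 7-to-3 or 6-to-4 split, which is one way to see how Strategy 2 produces more variable preference labels than Strategy 1.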
Diagnosing a Model Training Plateau
Evaluating Preference Datasets