1Cademy

Learn Before

Importance of Variability in Pairwise Preference Data

Multiple Choice

A team is training a language model using preference data from a group of 10 labelers. For each prompt, the labelers are shown two potential responses and asked to choose the better one. The team considers two data collection strategies:

Strategy 1: The team uses a highly aligned group of labelers who almost always agree. For 95% of the prompts, at least 9 out of 10 labelers choose the same response as the 'winner'.
Strategy 2: The team uses a more diverse group of labelers. For
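
The agreement figure quoted for Strategy 1 (the share of prompts on which at least 9 of 10 labelers pick the same response) can be computed from raw votes. The following is a minimal sketch, not part of the original question: the function name `agreement_stats`, the 0/1 vote encoding, and the toy votes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def agreement_stats(votes: List[List[int]], threshold: int = 9) -> Dict[str, object]:
    """Summarize labeler agreement on pairwise comparisons.

    votes[i][j] is 1 if labeler j preferred response A on prompt i,
    and 0 if that labeler preferred response B (assumed encoding).
    """
    fractions = []        # per-prompt fraction of labelers preferring A
    high_agreement = 0    # prompts where the majority side reaches the threshold
    for prompt_votes in votes:
        n = len(prompt_votes)
        votes_for_a = sum(prompt_votes)
        majority_size = max(votes_for_a, n - votes_for_a)
        fractions.append(votes_for_a / n)
        if majority_size >= threshold:
            high_agreement += 1
    return {
        "fraction_preferring_a": fractions,
        "high_agreement_share": high_agreement / len(votes),
    }


# Made-up votes for four prompts, ten labelers each.
toy_votes = [
    [1] * 10,             # unanimous for A
    [1] * 9 + [0],        # 9 of 10 for A
    [0] * 10,             # unanimous for B
    [1] * 6 + [0] * 4,    # a 6-4 split
]
stats = agreement_stats(toy_votes)
```

In this toy data, three of the four prompts meet the 9-of-10 threshold. The per-prompt fractions show how much graded information each comparison carries: near-unanimous prompts yield values close to 0 or 1, while split votes yield intermediate values.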

Updated 2025-09-28
