Multiple Choice

A team is training a language model using preference data from a group of 10 labelers. For each prompt, the labelers are shown two potential responses and asked to choose the better one. The team considers two data collection strategies:

  • Strategy 1: The team uses a highly aligned group of labelers who almost always agree. For 95% of the prompts, at least 9 out of 10 labelers choose the same response as the 'winner'.
  • Strategy 2: The team uses a more diverse group of labelers. For many prompts, there is significant disagreement, with preferences often split 6-to-4 or 7-to-3.

Based on principles of effective model training, which strategy is likely to produce a more useful dataset, and why?
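
One way to make the two strategies concrete is to quantify how much disagreement each vote split carries. The sketch below is illustrative only and not part of the original question: the function name `preference_stats` and the choice of binary entropy as a disagreement measure are assumptions. It converts a vote count among 10 labelers into a soft preference label, the majority share, and the entropy of the split.

```python
import math


def preference_stats(votes_for_a: int, n_labelers: int = 10) -> dict:
    """Summarize one prompt's votes (hypothetical helper for illustration).

    votes_for_a: number of labelers who preferred response A over response B.
    """
    p = votes_for_a / n_labelers          # empirical P(A preferred over B)
    majority = max(p, 1 - p)              # share of labelers on the winning side
    if p in (0.0, 1.0):
        entropy = 0.0                     # unanimous vote: no disagreement
    else:
        # Binary entropy in bits: 0 for unanimity, 1 for an even 5-to-5 split.
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return {"soft_label": p, "majority_share": majority, "entropy_bits": entropy}


# Splits mentioned in the question: 9-to-1 (Strategy 1), 7-to-3 and 6-to-4 (Strategy 2).
for votes in (9, 7, 6):
    print(votes, preference_stats(votes))
```

Under these assumptions, a 9-to-1 split has lower entropy than a 7-to-3 or 6-to-4 split. Whether low or high disagreement yields the more useful training signal is exactly what the question asks you to reason about.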

Updated 2025-09-28

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science