Short Answer

Evaluating Preference Datasets

An AI training team has two datasets of pairwise preference feedback for fine-tuning a language model. Both datasets have the same number of prompts.

  • Dataset X: For the vast majority of prompts, the labelers showed very high agreement, with over 95% choosing the same response as the 'winner'.
  • Dataset Y: For many prompts, the labelers showed significant disagreement, with preferences often split closer to 60/40 or 70/30.

Which dataset is likely to be more valuable for training a sophisticated and nuanced language model? Justify your choice by explaining the role of preference distribution in the training process.
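One way to put numbers on the preference distributions described above is sketched below. It is a minimal illustration, not part of the original question: the function names are hypothetical, and the use of a Bradley-Terry style model (where the probability of preferring one response is the sigmoid of a reward difference) is an assumption made only for this example. For a given labeler split, it computes the entropy of the labels and the reward gap such a model would need in order to reproduce that split.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a two-way preference split p / (1 - p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def implied_reward_gap(p: float) -> float:
    """Reward difference r_winner - r_loser that reproduces P(winner) = p.

    Assumes a Bradley-Terry style model: P(winner) = sigmoid(r_winner - r_loser),
    so the gap is the logit of p. Requires 0 < p < 1.
    """
    return math.log(p / (1 - p))


# Illustrative splits taken from the two dataset descriptions.
splits = [
    ("Dataset X style (95/5)", 0.95),
    ("Dataset Y style (70/30)", 0.70),
    ("Dataset Y style (60/40)", 0.60),
]

for label, p in splits:
    print(
        f"{label}: entropy = {binary_entropy(p):.3f} bits, "
        f"implied reward gap = {implied_reward_gap(p):.3f}"
    )
```

Under these assumptions, the near-unanimous split yields low label entropy and a large implied reward gap, while the 60/40 and 70/30 splits yield entropy close to one bit and a small gap. Whether that split reflects genuine nuance or labeler noise is the judgment the question asks you to argue.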

Updated 2025-10-06

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy
