Multiple Choice

A research team is training a model to score the quality of AI-generated text. They are considering two approaches for collecting human feedback to train this scoring model:

  • Approach A: Show a human evaluator two different text outputs for the same prompt and ask them to choose which one is better. The scoring model is then trained to predict this preference.
  • Approach B: Show a human evaluator a single text output and ask them to rate its quality on a scale of 1 to 10. The scoring model is then trained to predict this specific rating.
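To make the two learning tasks concrete, here is a minimal sketch of the per-example loss each approach might use. The question does not name any loss function, so both choices are assumptions: a Bradley-Terry style logistic loss on the score difference for Approach A, and squared error against the rating for Approach B. The function names are hypothetical.

```python
import math


def pairwise_loss(score_preferred: float, score_rejected: float) -> float:
    """Approach A (assumed Bradley-Terry form): -log sigmoid(s_preferred - s_rejected).

    Only the difference between the two scores matters, not their absolute values.
    """
    x = score_rejected - score_preferred
    # Numerically stable softplus: log(1 + exp(x))
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pointwise_loss(predicted_score: float, human_rating: float) -> float:
    """Approach B (assumed squared error): the model must match the 1-10 rating itself."""
    return (predicted_score - human_rating) ** 2


# The pairwise loss is low when the human-preferred output scores higher.
print(pairwise_loss(2.0, 0.0) < pairwise_loss(0.0, 2.0))
# The pointwise loss depends on the absolute gap to the rating.
print(pointwise_loss(6.0, 8.0))
```

In this sketch, the Approach A loss is unchanged if both scores are shifted by the same constant, while the Approach B loss ties the model's output to the numeric scale the evaluators used.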

What is the primary conceptual advantage of Approach B's framing of the learning task?


Updated 2025-09-28


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science