Multiple Choice

A research team is developing a reward model to score the quality of AI-generated poetry. Their human labelers are literary experts from diverse cultural backgrounds, so their opinions on what constitutes 'good' poetry are highly subjective and varied. Given this context, which of the following methods for collecting human feedback is likely to introduce the most noise and inconsistency into the reward model's training data?

Updated 2025-09-28

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.2 Generative Models - Foundations of Large Language Models

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science