Evaluating a Human Feedback Strategy
Critique the following proposed feedback collection strategy. Identify the most significant potential problem with this approach and recommend a more reliable alternative method, justifying why your recommendation would likely lead to more consistent data.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Reward Model Learning in RLHF
Pairwise Comparison for Human Feedback in RLHF
Listwise Ranking for Human Feedback in RLHF
Preference Notation in Human Feedback
Pointwise Method (Rating) for Human Feedback in RLHF
Evaluating a Human Feedback Strategy
A research team is developing a system to improve a language model using feedback from a large, diverse group of non-expert annotators. The team's primary goal is to ensure the feedback data is as consistent and reliable as possible, even with minimal training for the annotators. Which of the following feedback collection strategies would best achieve this goal, and why?
Trade-offs in Human Feedback Collection Methods