Learn Before
Rationale for a Feedback Collection Method
A team is developing a system to generate creative story ideas. To gather human feedback, they could ask evaluators to rate each story idea on a scale of 1 to 10. Alternatively, they could present two story ideas at a time and ask the evaluator to simply choose which one is better. Explain a significant advantage of the second approach (choosing the better of two) over the first (assigning a numerical score).
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluation Criteria for Pairwise Comparison in RLHF
Bradley-Terry Model
Reward Model Training as a Ranking Problem in RLHF
Listwise Ranking for Human Feedback in RLHF
Importance of Variability in Pairwise Preference Data
Evaluating a Feedback Collection Strategy
A development team is refining a language model's ability to generate summaries. For each source document, they have the model produce two different summaries. They then present these two summaries side-by-side to a human annotator and ask them to select the one that is of higher quality. Which statement best analyzes the primary strength of this specific approach for collecting human feedback?
Rationale for a Feedback Collection Method
Binary Encoding of Pairwise Feedback in RLHF