Interpreting Evaluator Preferences
A language model generates three responses to the prompt 'Explain gravity to a 5-year-old.'
- Response A: 'Gravity is a fundamental interaction which causes mutual attraction between all things with mass or energy.'
- Response B: 'Imagine the Earth is a giant bowling ball on a trampoline. Things like you and me are like little marbles that roll towards it. That's kind of like gravity!'
- Response C: 'Gravity is the force that keeps you on the ground so you don't float away into space. It's what makes an apple fall from a tree.'
An evaluator provides the following ranking, where '≻' signifies 'is preferred over': Response B ≻ Response C ≻ Response A
Based on this ranking, what primary characteristic was the evaluator likely prioritizing? Briefly justify your answer by referencing the content of the responses.
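The ranking notation above can be read as a compact set of pairwise preferences. The following is a minimal sketch of that reading, assuming the evaluator's preferences are transitive (so B ≻ C and C ≻ A together imply B ≻ A). The function name `ranking_to_pairs` and the single-letter labels are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def ranking_to_pairs(ranking):
    """Expand a best-to-worst ranking into (preferred, less_preferred) pairs.

    Assumes the ranking is a strict total order, so every earlier item
    is preferred over every later item.
    """
    # combinations() keeps the input order, so each pair is (better, worse).
    return list(combinations(ranking, 2))


# Response B is preferred over Response C, which is preferred over Response A.
ranking = ["B", "C", "A"]
pairs = ranking_to_pairs(ranking)
print(pairs)
```

A ranking of three responses yields three pairwise comparisons; in general, n ranked responses yield n(n-1)/2 pairs.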
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A team is refining a language model. For a single user prompt, the model generates four distinct responses: Response 1, Response 2, Response 3, and Response 4. A human evaluator is tasked with ordering these responses from best to worst. The evaluator concludes that Response 3 is the most helpful. Response 1 is the second-best, followed by Response 4. Response 2 is deemed the least helpful. Using the notation where '≻' signifies 'is preferred over,' which option correctly represents the evaluator's complete ranking?
A human evaluator was asked to rank three different responses (Response A, Response B, Response C) generated by a language model for the same prompt. Match each formal preference notation with the correct description of the evaluator's ranking.