1Cademy - Interpreting Evaluator Preferences

Learn Before

Example of Listwise Ranking in RLHF

Short Answer

Interpreting Evaluator Preferences

A language model generates three responses to the prompt 'Explain gravity to a 5-year-old.'

Response A: 'Gravity is a fundamental interaction which causes mutual attraction between all things with mass or energy.'
Response B: 'Imagine the Earth is a giant bowling ball on a trampoline. Things like you and me are like little marbles that roll towards it. That's kind of like gravity!'
Response C: 'Gravity is the force that keeps you on the ground so you don't float away into space. It's what makes an apple fall from a tree.'

An evaluator provides the following ranking, where '≻' signifies 'is preferred over': Response B ≻ Response C ≻ Response A

Based on this ranking, what primary characteristic was the evaluator likely prioritizing? Briefly justify your answer by referencing the content of the responses.

0

1

Updated 2025-10-10

Contributors are:

Who are from:

Learn Before

Related