Learn Before
Ranked Preference Notation
The pairwise preference notation can be extended to represent a complete ranking of multiple items. A sequence such as y1 ≻ y2 ≻ ... ≻ yn signifies a descending order of preference. In this ranking, y1 is the most preferred item, followed by y2, and so on, with yn being the least preferred item in the sequence.
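As a minimal sketch of how a ranking relates back to pairwise preferences, the snippet below expands a ranked list into every pairwise judgment it implies. The function name `ranking_to_pairs` and the item labels are hypothetical, and the sketch assumes the ranking is a strict, transitive order (so an item is preferred to everything listed after it).

```python
from itertools import combinations

def ranking_to_pairs(ranking):
    """Expand a ranking (most preferred first) into implied pairwise preferences.

    Each returned tuple (a, b) reads as "a ≻ b".
    Assumption: the ranking is strict and transitive.
    """
    # combinations() keeps the input order, so a always precedes b in the ranking.
    return [(a, b) for a, b in combinations(ranking, 2)]

# Hypothetical ranking: C ≻ A ≻ B
pairs = ranking_to_pairs(["C", "A", "B"])
print(pairs)  # [('C', 'A'), ('C', 'B'), ('A', 'B')]
```

A ranking of n items implies n(n-1)/2 pairwise preferences under this assumption, which is why a single ranked sequence is a compact way to record many comparisons.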
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of a Human Preference Ranking in RLHF
Ranked Preference Notation
Example of Listwise Ranking in RLHF
A language model generates two different summaries for a given article: Summary 1 and Summary 2. A human evaluator is tasked with reviewing them and determines that Summary 1 is more coherent and factually accurate than Summary 2. How would this specific judgment be formally expressed using standard preference notation?
A human annotator provides the following judgments for four text completions (C1, C2, C3, C4) generated in response to a single prompt: C1 ≻ C4, C4 ≻ C2, and C2 ≻ C3. Based on this information, arrange the completions in order from most preferred to least preferred.
Limitations of Preference Notation
Learn After
A human reviewer has provided feedback on three different AI-generated summaries of a document, ordering them from best to worst. The feedback is recorded as: Summary C ≻ Summary A ≻ Summary B. Based on this information, which of the following statements is true?
An AI model generated three different responses (A, B, and C) to a user's prompt. A human reviewer provided the following pairwise feedback: Response A ≻ Response C and Response C ≻ Response B. Based on this feedback, arrange the responses in descending order of preference, from most preferred to least preferred.
A team is evaluating three AI-generated responses, labeled X, Y, and Z. They collect the following pairwise feedback from a human reviewer: Response X ≻ Response Y, Response Y ≻ Response Z, and Response Z ≻ Response X. What is the most logical conclusion that can be drawn from this set of feedback?
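Going in the other direction, a set of pairwise judgments can be chained into a ranking only when the judgments are mutually consistent. The sketch below is one possible way to check this, using a topological sort from Python's standard `graphlib` module (Python 3.9+). The function name `pairs_to_ranking` is hypothetical, and the approach is an illustration rather than a method prescribed by the course material.

```python
from graphlib import TopologicalSorter, CycleError

def pairs_to_ranking(pairs):
    """Chain pairwise preferences (a, b), read as "a ≻ b", into one ranking.

    Returns a list from most to least preferred, or None if the
    judgments contain a cycle and no consistent ranking exists.
    """
    sorter = TopologicalSorter()
    for preferred, other in pairs:
        # The preferred item must appear before the other item.
        sorter.add(other, preferred)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

# A consistent chain of judgments yields a single ranking.
print(pairs_to_ranking([("C1", "C4"), ("C4", "C2"), ("C2", "C3")]))
# ['C1', 'C4', 'C2', 'C3']

# Cyclic judgments cannot be expressed as a ranking.
print(pairs_to_ranking([("X", "Y"), ("Y", "Z"), ("Z", "X")]))
# None
```

When the pairwise judgments do not cover every adjacent pair, more than one ordering may be consistent with them, and this sketch returns just one of those orderings.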