Preference Notation in Human Feedback
In the context of human feedback for language models, the notation y₁ ≻ y₂ is used to formally represent a preference. It signifies that a human annotator has judged output y₁ to be of higher quality, or more desirable, than output y₂.
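In code, a single pairwise judgment can be stored as an ordered pair (winner, loser), and a set of such judgments can be combined into a full ranking via transitivity. The sketch below is a minimal illustration (the function name and data layout are hypothetical, not from the chapter):

```python
# A preference y1 ≻ y2 is stored as an ordered pair (winner, loser).
preferences = [("C1", "C4"), ("C4", "C2"), ("C2", "C3")]

def total_order(prefs):
    """Recover a full ranking consistent with all pairwise judgments,
    assuming the judgments are transitive and connect every item."""
    items = {x for pair in prefs for x in pair}
    beats = set(prefs)
    # Close under transitivity: if a ≻ b and b ≻ c, then a ≻ c.
    changed = True
    while changed:
        changed = False
        for (a, b) in list(beats):
            for (c, d) in list(beats):
                if b == c and (a, d) not in beats:
                    beats.add((a, d))
                    changed = True
    # Rank each item by how many others it beats (most preferred first).
    return sorted(items,
                  key=lambda x: sum(1 for (w, _) in beats if w == x),
                  reverse=True)

print(total_order(preferences))  # → ['C1', 'C4', 'C2', 'C3']
```

This mirrors the kind of judgment chain in the exercises below: from C1 ≻ C4, C4 ≻ C2, and C2 ≻ C3, the implied total order is C1 ≻ C4 ≻ C2 ≻ C3.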

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Reward Model Learning in RLHF
Pairwise Comparison for Human Feedback in RLHF
Listwise Ranking for Human Feedback in RLHF
Pointwise Method (Rating) for Human Feedback in RLHF
Evaluating a Human Feedback Strategy
A research team is developing a system to improve a language model using feedback from a large, diverse group of non-expert annotators. The team's primary goal is to ensure the feedback data is as consistent and reliable as possible, even with minimal training for the annotators. Which of the following feedback collection strategies would best achieve this goal, and why?
Trade-offs in Human Feedback Collection Methods
Learn After
Example of a Human Preference Ranking in RLHF
Ranked Preference Notation
Example of Listwise Ranking in RLHF
A language model generates two different summaries for a given article: Summary 1 and Summary 2. A human evaluator is tasked with reviewing them and determines that Summary 1 is more coherent and factually accurate than Summary 2. How would this specific judgment be formally expressed using standard preference notation?
A human annotator provides the following judgments for four text completions (C1, C2, C3, C4) generated in response to a single prompt: C1 ≻ C4, C4 ≻ C2, and C2 ≻ C3. Based on this information, arrange the completions in order from most preferred to least preferred.
Limitations of Preference Notation