Example

Example of a Human Preference Ranking in RLHF

In the data annotation stage of RLHF, human evaluators rank multiple model-generated outputs for a given prompt. For example, if four outputs are presented, an annotator's preference might be expressed with the ranking $y_1 \succ y_4 \succ y_2 \succ y_3$. This indicates that $y_1$ is the most preferred response, followed by $y_4$ and $y_2$, with $y_3$ being the least preferred.
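Such a ranking is commonly expanded into pairwise (chosen, rejected) comparisons before reward-model training, since a ranking of $K$ responses implies $K(K-1)/2$ pairwise preferences. The sketch below illustrates this expansion; the function name `ranking_to_pairs` and the placeholder response labels are illustrative, not from the source.

```python
from itertools import combinations

def ranking_to_pairs(ranked_responses):
    """Expand a preference ranking (best to worst) into (chosen, rejected) pairs.

    A ranking of K responses yields K*(K-1)/2 pairwise comparisons,
    the form typically consumed by reward-model training.
    """
    pairs = []
    for better, worse in combinations(ranked_responses, 2):
        # `combinations` preserves the input order, so `better` always
        # precedes `worse` in the annotator's ranking.
        pairs.append({"chosen": better, "rejected": worse})
    return pairs

# The annotator's ranking y1 > y4 > y2 > y3, listed from most to least preferred.
ranking = ["y1", "y4", "y2", "y3"]

for pair in ranking_to_pairs(ranking):
    print(pair["chosen"], ">", pair["rejected"])
# Prints: y1 > y4, y1 > y2, y1 > y3, y4 > y2, y4 > y3, y2 > y3
```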
