Learn Before
Binary Encoding of Pairwise Feedback in RLHF
When collecting human feedback through pairwise comparisons in Reinforcement Learning from Human Feedback (RLHF), the annotator's choice is converted into a binary label. For instance, if a human prefers the first output over the second, this preference can be encoded as 1, while a preference for the second output over the first would be encoded as 0.
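As a minimal sketch of this encoding, the hypothetical helper below (the function name and label convention are illustrative, not from a specific library) maps a labeler's choice between two outputs to a binary label:

```python
def encode_preference(chosen: str, first: str, second: str) -> int:
    """Encode a pairwise preference as a binary label.

    Returns 1 if the first output was preferred, 0 if the second was.
    """
    if chosen == first:
        return 1
    if chosen == second:
        return 0
    raise ValueError("chosen must be one of the two compared outputs")

# Labeler prefers the first output -> label 1
print(encode_preference("Summary A", "Summary A", "Summary B"))  # 1
# Labeler prefers the second output -> label 0
print(encode_preference("Summary B", "Summary A", "Summary B"))  # 0
```

These binary labels then serve as training targets for a reward model, e.g., under a Bradley-Terry-style objective.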
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Evaluation Criteria for Pairwise Comparison in RLHF
Bradley-Terry Model
Reward Model Training as a Ranking Problem in RLHF
Listwise Ranking for Human Feedback in RLHF
Importance of Variability in Pairwise Preference Data
Evaluating a Feedback Collection Strategy
A development team is refining a language model's ability to generate summaries. For each source document, they have the model produce two different summaries. They then present these two summaries side-by-side to a human annotator and ask them to select the one that is of higher quality. Which statement best analyzes the primary strength of this specific approach for collecting human feedback?
Rationale for a Feedback Collection Method
Binary Encoding of Pairwise Feedback in RLHF
Learn After
A human labeler is tasked with providing feedback on two different AI-generated summaries of an article, labeled Summary A and Summary B. After reviewing both, the labeler selects Summary B as the better one. In a typical system that uses pairwise comparisons to gather human feedback, how is this single preference decision mathematically encoded for the training process?
In a system that collects human feedback by presenting two model-generated responses for comparison, if a human evaluator strongly prefers response A over response B, this preference is encoded with a higher numerical value than if they only slightly preferred response A.
Representing Preference Data
An AI team is preparing a dataset for training a reward model. They present pairs of model-generated text, (Response 1, Response 2), to human labelers. The team's convention is to encode the preference as '1' if Response 1 is chosen, and '0' if Response 2 is chosen. Given the following three labeling results, what is the correct sequence of binary labels that should be recorded for the dataset?
- Session 1: The pair (Text A, Text B) was shown, and the labeler chose Text B.
- Session 2: The pair (Text C, Text D) was shown, and the labeler chose Text C.
- Session 3: The pair (Text E, Text F) was shown, and the labeler chose Text F.