Activity (Process)

Binary Encoding of Pairwise Feedback in RLHF

When collecting human feedback through pairwise comparisons in Reinforcement Learning from Human Feedback (RLHF), the expert's choice is converted into a binary label. For instance, if a human prefers output $\mathbf{y}_a$ over $\mathbf{y}_b$, this preference can be encoded as a 1, while a preference for $\mathbf{y}_b$ over $\mathbf{y}_a$ would be encoded as a 0.
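The encoding described above can be sketched in a few lines of Python. This is only an illustration: the function name `encode_preference` and the use of the strings `"a"` and `"b"` to record the annotator's choice are assumptions made for this example, not part of any particular RLHF library.

```python
def encode_preference(preferred: str) -> int:
    """Map a pairwise choice to a binary label.

    Hypothetical convention for this sketch:
      "a" means the annotator preferred y_a over y_b -> label 1
      "b" means the annotator preferred y_b over y_a -> label 0
    """
    if preferred == "a":
        return 1
    if preferred == "b":
        return 0
    raise ValueError(f"expected 'a' or 'b', got {preferred!r}")


# Three illustrative annotator choices (made-up data).
choices = ["a", "b", "a"]
labels = [encode_preference(c) for c in choices]
print(labels)  # [1, 0, 1]
```

Binary labels of this kind are commonly used as classification targets when fitting a model to the comparisons, though the exact training objective depends on the method used.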


Updated 2026-05-01

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences