Formula

Bradley-Terry Model for Preference Probability

The probability that a response $\mathbf{y}_a$ is preferred over another response $\mathbf{y}_b$ (denoted $\mathbf{y}_a \succ \mathbf{y}_b$), given an input $\mathbf{x}$, can be modeled using a formulation based on the Bradley-Terry model. This model defines the probability as a sigmoid function of the difference between their respective reward scores, $r(\mathbf{x}, \mathbf{y}_a)$ and $r(\mathbf{x}, \mathbf{y}_b)$. The formula is:

$$\text{Pr}_{\theta}(\mathbf{y}_a \succ \mathbf{y}_b \mid \mathbf{x}) = \text{Sigmoid}(r(\mathbf{x}, \mathbf{y}_a) - r(\mathbf{x}, \mathbf{y}_b))$$

This maps the reward difference, which can be any real number, to a valid probability between 0 and 1.
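As a rough illustration of the formula, the following Python sketch computes the preference probability from two scalar reward scores. It assumes the rewards $r(\mathbf{x}, \mathbf{y}_a)$ and $r(\mathbf{x}, \mathbf{y}_b)$ have already been produced by some reward model; the function names `sigmoid` and `preference_probability` and the parameters `reward_a` and `reward_b` are hypothetical, chosen only for this example.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form exp(z) / (1 + exp(z)).
    ez = math.exp(z)
    return ez / (1.0 + ez)


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Pr(y_a preferred over y_b | x) under the Bradley-Terry formulation.

    reward_a and reward_b stand in for r(x, y_a) and r(x, y_b); how they
    are computed is outside the scope of this sketch.
    """
    return sigmoid(reward_a - reward_b)


if __name__ == "__main__":
    # Equal rewards give no preference either way.
    print(preference_probability(1.0, 1.0))
    # A higher reward for y_a pushes the probability above one half.
    print(preference_probability(2.0, 0.0))
```

Only the difference between the two rewards matters, so adding the same constant to both scores leaves the probability unchanged, and the probabilities of the two possible orderings sum to 1.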

Updated 2026-05-01

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences