Formula

Listwise Loss Formula from Accumulated Pairwise Comparisons

The listwise loss, formulated by accumulating pairwise comparisons, is defined as the negative expected log-probability over all distinct pairs within a ranked list. The formula is:

$$
\mathcal{L}_{\mathrm{list}} = -\mathbb{E}_{(\mathbf{x},Y) \sim \mathcal{D}_r} \Big[ \frac{1}{N(N-1)} \sum_{\substack{\mathbf{y}_a \in Y,\, \mathbf{y}_b \in Y \\ \mathbf{y}_a \ne \mathbf{y}_b}} \log \mathrm{Pr}(\mathbf{y}_a \succ \mathbf{y}_b \mid \mathbf{x}) \Big]
$$

Here, $\mathcal{L}_{\mathrm{list}}$ denotes the listwise loss. The expectation $\mathbb{E}$ is taken over samples $(\mathbf{x},Y)$ from the preference dataset $\mathcal{D}_r$, where $Y$ is a ranked list of $N$ outputs for a given prompt $\mathbf{x}$. The summation aggregates the log conditional probability of the preference for every ordered pair of distinct outputs $(\mathbf{y}_a, \mathbf{y}_b)$ in the list $Y$. The term $\frac{1}{N(N-1)}$ normalizes the sum by the total number of possible ordered pairs.
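The computation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `listwise_loss_single` and `listwise_loss` are hypothetical, the pairwise probabilities $\mathrm{Pr}(\mathbf{y}_a \succ \mathbf{y}_b \mid \mathbf{x})$ are assumed to be supplied by the caller as an $N \times N$ matrix (the formula does not say how they are modeled), and the expectation over $\mathcal{D}_r$ is approximated by a plain average over a batch of samples.

```python
import math
from itertools import permutations


def listwise_loss_single(pref_prob):
    """Loss for one sample (x, Y).

    pref_prob[a][b] is assumed to hold Pr(y_a > y_b | x) for the N outputs
    in Y; diagonal entries are ignored because the sum requires y_a != y_b.
    """
    n = len(pref_prob)
    if n < 2:
        raise ValueError("a list needs at least two outputs to form a pair")
    # Sum log-probabilities over all N(N-1) ordered pairs of distinct outputs.
    total = sum(
        math.log(pref_prob[a][b]) for a, b in permutations(range(n), 2)
    )
    # Normalize by the number of ordered pairs and negate.
    return -total / (n * (n - 1))


def listwise_loss(batch):
    """Approximate the expectation by averaging over a batch of samples."""
    return sum(listwise_loss_single(p) for p in batch) / len(batch)
```

As a sanity check on the normalization: if every pairwise probability equals 0.5, each of the $N(N-1)$ terms contributes $\log 0.5$, so the per-sample loss is $\log 2$ regardless of $N$.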

Updated 2026-05-02

Tags

Ch.4 Alignment - Foundations of Large Language Models
