Learn Before
  • Listwise Loss from Accumulated Pairwise Comparisons

  • Modeling Pairwise Preference Probability with a Reward Function

Listwise Loss Formula from Accumulated Pairwise Comparisons

The listwise loss, derived by aggregating pairwise comparisons, is defined as the negative expected log-likelihood of the ground-truth preferences, averaged over all ordered pairs of distinct outputs in a ranked list. The formula is:

\mathcal{L}_{\text{list}} = -\mathbb{E}_{(\mathbf{x},Y)\sim\mathcal{D}_r}\left[\frac{1}{N(N-1)}\sum_{\substack{\mathbf{y}_a\in Y, \mathbf{y}_b\in Y \\ \mathbf{y}_a\neq \mathbf{y}_b}} \log\Pr(\mathbf{y}_a \succ \mathbf{y}_b|\mathbf{x})\right]

Here:

  • \mathcal{L}_{\text{list}} is the listwise loss.
  • The expectation \mathbb{E} is taken over samples (\mathbf{x}, Y) from the preference dataset \mathcal{D}_r, where Y is the ranked list of N outputs for a prompt \mathbf{x}.
  • The summation aggregates the log probability of the ground-truth preference for every ordered pair of distinct outputs (\mathbf{y}_a, \mathbf{y}_b) within the list Y.
  • The term \frac{1}{N(N-1)} serves as a normalization factor, averaging the loss over the total number of possible ordered pairs.
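The formula above can be sketched in code. This is a minimal illustration, not the course's reference implementation: it assumes the sigmoid-of-reward-difference model from the prerequisite topic, takes reward scores already ordered best-to-worst by the annotator's ranking, sums the log preference probabilities over the pairs that ranking orders, and applies the formula's \frac{1}{N(N-1)} normalizer.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def listwise_loss(rewards: list[float]) -> float:
    """Listwise loss for one ranked list.

    rewards: reward-model scores r(x, y), listed best-to-worst
    according to the ground-truth ranking (an assumption of this sketch).
    """
    n = len(rewards)
    total = 0.0
    # For each pair where y_a is ranked above y_b, accumulate
    # log Pr(y_a > y_b | x), modeled as sigmoid(r_a - r_b).
    for a in range(n):
        for b in range(a + 1, n):
            total += math.log(sigmoid(rewards[a] - rewards[b]))
    # Negate and normalize by N(N-1), as in the formula.
    return -total / (n * (n - 1))

# A reward model that separates the ranked outputs well incurs a small loss:
loss = listwise_loss([2.0, 1.0, 0.5, -1.0])
```

Note that a completely uncertain model (all rewards equal, so every pairwise probability is 0.5) still incurs a strictly positive loss, since \log 0.5 \neq 0.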

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related
  • Listwise Loss Formula from Accumulated Pairwise Comparisons

  • A human annotator is given four model-generated responses (A, B, C, D) to a prompt and ranks them in order of preference from best to worst as: C > A > D > B. To train a preference model, a loss function is calculated by summing the individual losses for every pairwise comparison implied by this ranking. Which of the following sets represents all the pairwise preferences that would be used in this loss calculation?

  • Decomposing a Ranked List into Pairwise Preferences

  • Evaluating Preference Model Performance with Listwise Loss

  • Empirical Reward Model Loss Formula

  • Empirical Formulation of Pair-wise Ranking Loss

  • A system learns a function, r(input, response), that assigns a numerical score indicating the quality of a response for a given input. The probability that response Y_a is preferred over response Y_b is then calculated using the formula: Probability = Sigmoid(r(input, Y_a) - r(input, Y_b)), where Sigmoid(z) = 1 / (1 + e^-z). Given the following scenarios for a single input, which one presents a logical inconsistency between the assigned scores and the resulting preference probability?

  • Preference Probability Calculation

  • Invariance of Preference Probability
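The decomposition exercised in the questions above (turning one ranked list into its implied pairwise preferences) can be sketched as follows; the function name is illustrative, not from the course material:

```python
from itertools import combinations

def ranking_to_pairs(ranking: list[str]) -> list[tuple[str, str]]:
    # A best-to-worst ranking implies, for every pair of items,
    # that the earlier item is preferred over the later one.
    # combinations() preserves input order, so each tuple is (winner, loser).
    return [(winner, loser) for winner, loser in combinations(ranking, 2)]

# The ranking C > A > D > B yields N(N-1)/2 = 6 pairwise preferences:
pairs = ranking_to_pairs(["C", "A", "D", "B"])
# → [('C', 'A'), ('C', 'D'), ('C', 'B'), ('A', 'D'), ('A', 'B'), ('D', 'B')]
```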

Learn After
  • Consider the following formula for a loss function used to train a model on ranked lists of outputs, where N is the number of items in a given list Y:

    \mathcal{L} = -\mathbb{E}\left[\frac{1}{N(N-1)}\sum_{\substack{\mathbf{y}_a\in Y, \mathbf{y}_b\in Y \\ \mathbf{y}_a\neq \mathbf{y}_b}} \log\Pr(\mathbf{y}_a \succ \mathbf{y}_b|\mathbf{x})\right]

    What is the primary analytical consequence of including the normalization term \frac{1}{N(N-1)} in this calculation?

  • Applying the Listwise Loss Summation

  • Consider the listwise loss formula used for training on ranked preferences:

    \mathcal{L} = -\mathbb{E}\left[\frac{1}{N(N-1)}\sum_{\substack{\mathbf{y}_a, \mathbf{y}_b \in Y \\ \mathbf{y}_a\neq \mathbf{y}_b}} \log\Pr(\mathbf{y}_a \succ \mathbf{y}_b|\mathbf{x})\right]

    True or False: If a model is completely uncertain about the preferences within a ranked list (i.e., it assigns \Pr(\mathbf{y}_a \succ \mathbf{y}_b|\mathbf{x}) = 0.5 for all distinct pairs), the contribution of that specific list to the overall loss will be zero.
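The uniform-uncertainty scenario in the last question can be checked numerically. This sketch applies the formula literally (all N(N-1) ordered distinct pairs, here with an assumed N = 4):

```python
import math

# If Pr(y_a > y_b | x) = 0.5 for every distinct ordered pair,
# each pair contributes -log 0.5 = log 2 to the (negated) sum.
n = 4
num_pairs = n * (n - 1)  # all ordered distinct pairs
loss = -sum(math.log(0.5) for _ in range(num_pairs)) / num_pairs
# loss equals log 2 regardless of n, since the normalizer matches the pair count.
```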