Listwise Loss Formula from Accumulated Pairwise Comparisons
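The listwise loss, formulated by accumulating pairwise comparisons, is defined as the negative expected log-probability of the pairwise preferences, averaged over all distinct pairs within a ranked list. The formula is:
$$\mathcal{L}_{\text{list}} = -\mathbb{E}_{(x, Y) \sim \mathcal{D}} \left[ \frac{1}{N(N-1)} \sum_{y_a \in Y,\; y_b \in Y,\; y_a \neq y_b} \log \Pr(y_a \succ y_b \mid x) \right]$$
Here, $\mathcal{L}_{\text{list}}$ denotes the listwise loss. The expectation is taken over samples $(x, Y)$ from the preference dataset $\mathcal{D}$, where $Y$ is a ranked list of $N$ outputs for a given prompt $x$. The summation aggregates the log conditional probability of the preference $\Pr(y_a \succ y_b \mid x)$ for every ordered pair of distinct outputs $(y_a, y_b)$ in the list $Y$. The term $\frac{1}{N(N-1)}$ normalizes the sum by the total number of possible ordered pairs, so that lists of different lengths contribute on a comparable scale.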
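The computation for a single ranked list can be sketched in Python. This is a minimal illustration, not a reference implementation, and it rests on two assumptions: the pairwise preference probability is modeled as a sigmoid of the difference between two reward scores (the form used in the related preference-probability item), and for each ordered pair the term that enters the sum is the log-probability of the preference the ranking actually implies. The function name `listwise_loss` and its argument are hypothetical.

```python
import math
from itertools import permutations


def sigmoid(z: float) -> float:
    """Logistic function, 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def listwise_loss(scores_best_to_worst: list[float]) -> float:
    """Loss contribution of one ranked list.

    `scores_best_to_worst` holds the model's reward scores for the N outputs,
    ordered as the annotator ranked them (index 0 is the most preferred).
    Assumption: Pr(y_a preferred over y_b | x) = sigmoid(r(x, y_a) - r(x, y_b)).
    """
    n = len(scores_best_to_worst)
    if n < 2:
        raise ValueError("a ranked list needs at least two outputs")

    total_log_prob = 0.0
    # Visit every ordered pair of distinct positions: N * (N - 1) pairs.
    for i, j in permutations(range(n), 2):
        # The output ranked higher (smaller index) is the preferred one.
        winner, loser = (i, j) if i < j else (j, i)
        p = sigmoid(scores_best_to_worst[winner] - scores_best_to_worst[loser])
        total_log_prob += math.log(p)

    # Normalize by the number of ordered pairs and negate.
    return -total_log_prob / (n * (n - 1))


# Scores that agree with the ranking give a smaller loss than scores that
# contradict it.
agreeing = listwise_loss([2.0, 1.0, 0.0, -1.0])
contradicting = listwise_loss([-1.0, 0.0, 1.0, 2.0])
```

The expectation over the dataset would then be approximated by averaging this per-list value over all sampled (prompt, ranked list) pairs.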

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A human annotator is given four model-generated responses (A, B, C, D) to a prompt and ranks them in order of preference from best to worst as: C > A > D > B. To train a preference model, a loss function is calculated by summing the individual losses for every pairwise comparison implied by this ranking. Which of the following sets represents all the pairwise preferences that would be used in this loss calculation?
Decomposing a Ranked List into Pairwise Preferences
Evaluating Preference Model Performance with Listwise Loss
Empirical Reward Model Loss Formula
Empirical Formulation of Pair-wise Ranking Loss
A system learns a function, r(input, response), that assigns a numerical score indicating the quality of a response for a given input. The probability that response Y_a is preferred over response Y_b is then calculated using the formula: Probability = Sigmoid(r(input, Y_a) - r(input, Y_b)), where Sigmoid(z) = 1 / (1 + e^(-z)). Given the following scenarios for a single input, which one presents a logical inconsistency between the assigned scores and the resulting preference probability?
Preference Probability Calculation
Invariance of Preference Probability
Learn After
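Consider the following formula for a loss function used to train a model on ranked lists of outputs, where N is the number of items in a given list Y:
$$\mathcal{L}_{\text{list}} = -\mathbb{E}_{(x, Y) \sim \mathcal{D}} \left[ \frac{1}{N(N-1)} \sum_{y_a \in Y,\; y_b \in Y,\; y_a \neq y_b} \log \Pr(y_a \succ y_b \mid x) \right]$$
What is the primary analytical consequence of including the normalization term $\frac{1}{N(N-1)}$ in this calculation?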
Applying the Listwise Loss Summation
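Consider the listwise loss formula used for training on ranked preferences:
$$\mathcal{L}_{\text{list}} = -\mathbb{E}_{(x, Y) \sim \mathcal{D}} \left[ \frac{1}{N(N-1)} \sum_{y_a \in Y,\; y_b \in Y,\; y_a \neq y_b} \log \Pr(y_a \succ y_b \mid x) \right]$$
True or False: If a model is completely uncertain about the preferences within a ranked list (i.e., it assigns $\Pr(y_a \succ y_b \mid x) = 0.5$ for all distinct pairs), the contribution of that specific list to the overall loss will be zero.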