Learn Before
Plackett-Luce Loss Formula
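Given the log-probability $\log \Pr(Y|x)$ of a ground-truth ordered list $Y$ conditioned on an input $x$, the loss function based on the Plackett-Luce model is defined as the expected negative log-probability over the preference dataset $\mathcal{D}$. The formula is:

$$\mathcal{L} = -\mathbb{E}_{(x, Y) \sim \mathcal{D}} \left[ \log \Pr(Y|x) \right]$$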
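The definition above can be illustrated with a minimal Python sketch. It assumes, as is common for the Plackett-Luce model but not stated in this card, that each item in the list has a real-valued score and that the probability of picking an item from the remaining set is a softmax over those scores. The function names `plackett_luce_log_prob` and `plackett_luce_loss` are hypothetical, and the expectation over the dataset is approximated by a simple average.

```python
import math


def plackett_luce_log_prob(scores):
    """Log-probability of a ranking under a Plackett-Luce model.

    `scores` holds one score per item, listed in ground-truth order
    (best first). Assumption: the chance of choosing an item from the
    remaining set is a softmax over the remaining scores.
    """
    log_prob = 0.0
    for k in range(len(scores)):
        remaining = scores[k:]
        # Log-sum-exp over the items not yet placed, shifted for stability.
        top = max(remaining)
        log_norm = top + math.log(sum(math.exp(s - top) for s in remaining))
        # Log-probability of choosing the k-th item from the remaining set.
        log_prob += scores[k] - log_norm
    return log_prob


def plackett_luce_loss(dataset_scores):
    """Average negative log-probability over a list of score lists."""
    total = sum(plackett_luce_log_prob(scores) for scores in dataset_scores)
    return -total / len(dataset_scores)


# Scores that agree with the ground-truth order give a lower loss
# than scores that reverse it.
good = plackett_luce_loss([[2.0, 1.0, 0.0]])
bad = plackett_luce_loss([[0.0, 1.0, 2.0]])
print(good < bad)
```

With three equally scored items every ordering is equally likely, so the probability of the ground-truth ranking is 1/6 and the loss for that single example is log 6.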

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Plackett-Luce Loss Formula
A model is being trained for a listwise ranking task. For one training example, it must rank three items: Item X, Item Y, and Item Z. The correct, ground-truth ranking is X > Y > Z. The training objective is to minimize the negative log-likelihood of observing this ground-truth sequence. Which expression correctly represents the quantity to be minimized for this single training instance, where P(A | S) is the probability of choosing item A from the set of available items S?
Analyzing Model Error with Plackett-Luce Loss
True or false: In a listwise ranking task, if the training objective is to minimize the negative log-likelihood of the ground-truth ranked sequences, a decrease in the loss value over training epochs signifies that the model is assigning a lower probability to the correct sequences.
Learn After
Analysis of Ranking Error Penalties
A language model is being trained on a preference dataset. For a single input prompt, the ground-truth ranked sequence of responses is Y. The model calculates the probability of observing this exact sequence as Pr(Y|x) = 0.25. Based on the formula for the objective function that maximizes the likelihood of the model predicting the correct rankings, what is the loss value for this single data point?
Model Performance Evaluation using Plackett-Luce Loss