Plackett-Luce Loss Formula

Given the log-probability $\log \Pr(\mathring{Y} \mid \mathbf{x})$ of a ground-truth ordered list $\mathring{Y}$ conditioned on an input $\mathbf{x}$, the loss function based on the Plackett-Luce model is defined as the expected negative log-probability over the preference dataset $\mathcal{D}_r$. The formula is:

$$\mathcal{L}_{\mathrm{pl}} = -\mathbb{E}_{(\mathbf{x},\mathring{Y}) \sim \mathcal{D}_r} \big[ \log \Pr(\mathring{Y} \mid \mathbf{x}) \big]$$
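To make the expectation concrete, here is a minimal Python sketch that estimates the loss as an average over a finite sample of ranked lists. This card does not say how the log-probability is computed, so the sketch assumes the standard Plackett-Luce factorization, in which each ranked item is chosen by a softmax over the items not yet placed. The function names and the idea of one scalar score per item are hypothetical, introduced only for illustration.

```python
import math


def plackett_luce_log_prob(scores):
    """Log-probability of one ordered list under a Plackett-Luce model.

    Assumption: `scores` holds one scalar score per item, already sorted
    from most preferred to least preferred (the ground-truth order).
    """
    log_prob = 0.0
    for k in range(len(scores)):
        remaining = scores[k:]  # items not yet placed in the ranking
        # Numerically stable log-sum-exp over the remaining items.
        top = max(remaining)
        log_norm = top + math.log(sum(math.exp(s - top) for s in remaining))
        log_prob += scores[k] - log_norm
    return log_prob


def plackett_luce_loss(batch):
    """Sample estimate of the loss: mean negative log-probability.

    Assumption: `batch` is a list of score lists, one per sampled
    (input, ordered list) pair from the preference dataset.
    """
    return -sum(plackett_luce_log_prob(scores) for scores in batch) / len(batch)
```

With two items per list, the log-probability under this factorization reduces to the log-sigmoid of the score difference, which is the familiar pairwise (Bradley-Terry) case.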

Updated 2026-05-02

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences