Case Study

Evaluating Preference Model Performance with Listwise Loss

A preference model is trained with a loss computed by summing the losses of all individual pairwise comparisons derived from a ranked list. Given the human preference ranking and the two model predictions below, which model incurs the higher training loss on this specific example? Justify your answer by explaining how the number of incorrectly ordered pairs contributes to the total loss.
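As a rough illustration of how such a summed pairwise loss behaves, here is a minimal Python sketch. It assumes a Bradley-Terry style per-pair loss, -log(sigmoid(s_i - s_j)), which the case study does not specify. The function names and the scores for the two models are hypothetical and are not the predictions the question refers to.

```python
import math
from itertools import combinations


def pairwise_sum_loss(scores):
    """Sum -log(sigmoid(s_i - s_j)) over every pair (i, j) with i ranked above j.

    `scores` holds the model's scores listed in the human-preferred order
    (most preferred first). The per-pair loss form is an assumption.
    """
    total = 0.0
    for i, j in combinations(range(len(scores)), 2):
        margin = scores[i] - scores[j]
        # -log(sigmoid(margin)) == log(1 + exp(-margin))
        total += math.log1p(math.exp(-margin))
    return total


def count_misordered_pairs(scores):
    """Count pairs where the model fails to score the preferred item higher."""
    return sum(
        1
        for i, j in combinations(range(len(scores)), 2)
        if scores[i] <= scores[j]
    )


# Hypothetical scores for three responses, listed in human-preferred order.
model_a = [2.0, 1.0, 0.0]  # agrees with the human ranking on every pair
model_b = [0.0, 1.0, 2.0]  # reverses the human ranking on every pair

loss_a = pairwise_sum_loss(model_a)
loss_b = pairwise_sum_loss(model_b)
print(f"model_a: misordered={count_misordered_pairs(model_a)}, loss={loss_a:.4f}")
print(f"model_b: misordered={count_misordered_pairs(model_b)}, loss={loss_b:.4f}")
```

Under this assumed loss, each misordered pair has a negative margin and so contributes more than log 2 to the sum, while each correctly ordered pair contributes less than log 2. With margins of similar size, the model with more misordered pairs therefore accumulates the larger total.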


Updated 2025-10-06

