Multiple Choice

A model is being trained for a listwise ranking task. For one training example, it must rank three items: Item X, Item Y, and Item Z. The correct, ground-truth ranking is X > Y > Z. The training objective is to minimize the negative log-likelihood of observing this ground-truth sequence. Which expression correctly represents the quantity to be minimized for this single training instance, where P(A | S) is the probability of choosing item A from the set of available items S?
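Under the sequential-choice reading the question describes, the likelihood of the ranking X > Y > Z factorizes into one choice per position, so the quantity to minimize is -[log P(X | {X, Y, Z}) + log P(Y | {Y, Z}) + log P(Z | {Z})], where the last term is log 1 = 0. The sketch below computes this in Python. It assumes each P(A | S) is a softmax over per-item scores (a Plackett-Luce style model), which the question does not specify; the function name `listwise_nll` and the score values are hypothetical and chosen only for illustration.

```python
import math

def listwise_nll(scores, ranking):
    """Negative log-likelihood of `ranking` under a sequential-choice model.

    Assumption: P(A | S) = exp(score[A]) / sum(exp(score[j]) for j in S),
    i.e. a softmax over the items still available.
    """
    remaining = list(ranking)
    nll = 0.0
    for item in ranking:
        # log P(item | remaining) = score[item] - log-sum-exp over remaining items
        log_denom = math.log(sum(math.exp(scores[j]) for j in remaining))
        nll -= scores[item] - log_denom
        # The chosen item is no longer available at the next position.
        remaining.remove(item)
    return nll

# Hypothetical scores for the three items.
scores = {"X": 2.0, "Y": 1.0, "Z": 0.0}
loss = listwise_nll(scores, ["X", "Y", "Z"])
reversed_loss = listwise_nll(scores, ["Z", "Y", "X"])
```

With these scores the ground-truth order receives a lower loss than the reversed order, which is the behavior the training objective rewards.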

Updated 2025-09-29

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science