Short Answer

Analysis of a Weighted Ranking Loss

A team is training a reward model using preference data. They are considering two different loss formulations for a single data sample consisting of a prompt x, a preferred response y_pref, and a rejected response y_rej.

Loss Formulation A: Loss = -log(Sigmoid(R(x, y_pref) - R(x, y_rej)))

Loss Formulation B: Loss = -Pr(y_pref ≻ y_rej | x) * log(Sigmoid(R(x, y_pref) - R(x, y_rej)))

Explain the practical implication of including the Pr(y_pref ≻ y_rej | x) term in Loss Formulation B. How does this term change the way the model learns from the preference data compared to using Loss Formulation A?
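To make the comparison concrete, here is a minimal sketch of the two formulations for a single sample. It assumes that Pr(y_pref ≻ y_rej | x) is a fixed number attached to the sample (for example, the fraction of annotators who preferred y_pref) rather than something the reward model predicts. The function names and the example values are hypothetical and do not come from the course material.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def loss_a(r_pref: float, r_rej: float) -> float:
    """Formulation A: -log(Sigmoid(R(x, y_pref) - R(x, y_rej)))."""
    return -math.log(sigmoid(r_pref - r_rej))


def loss_b(r_pref: float, r_rej: float, p_pref: float) -> float:
    """Formulation B: the same pairwise loss scaled by p_pref.

    p_pref stands in for Pr(y_pref > y_rej | x) and is assumed to be a
    constant supplied with the data, not a model output.
    """
    return p_pref * loss_a(r_pref, r_rej)


def grad_margin_b(r_pref: float, r_rej: float, p_pref: float) -> float:
    """Derivative of Formulation B with respect to the reward margin."""
    margin = r_pref - r_rej
    return p_pref * (sigmoid(margin) - 1.0)


if __name__ == "__main__":
    # Hypothetical reward scores and preference strengths.
    r_pref, r_rej = 0.2, 0.5
    for p in (1.0, 0.9, 0.55):
        print(p, loss_a(r_pref, r_rej), loss_b(r_pref, r_rej, p),
              grad_margin_b(r_pref, r_rej, p))
```

Under this assumption, Formulation B equals Formulation A multiplied by the preference probability, so the gradient for a sample shrinks in proportion to how weak its recorded preference is, and with a probability of 1 the two formulations coincide.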

Updated 2025-10-04

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science