Multiple Choice

A language model is being fine-tuned using a dataset of prompts $x$, preferred responses $y_a$, and dispreferred responses $y_b$. The training objective is to minimize the following loss function:

$$\mathcal{L} = -\mathbb{E}_{(x, y_a, y_b)}\left[\log \text{Pr}(y_a \succ y_b \mid x)\right]$$

In this framework, the probability that response $y_a$ is preferred over $y_b$, denoted $\text{Pr}(y_a \succ y_b \mid x)$, is computed directly from the likelihoods of each response under the current policy being trained and a fixed reference policy.
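For reference, one common way to realize this setup (a DPO-style parameterization; the policy $\pi_\theta$, the fixed reference policy $\pi_{\text{ref}}$, and the scaling coefficient $\beta$ are illustrative symbols assumed here, not given in the question) expresses the preference probability as a sigmoid of the difference in scaled log-likelihood ratios:

$$\text{Pr}(y_a \succ y_b \mid x) = \sigma\!\left(\beta \log \frac{\pi_\theta(y_a \mid x)}{\pi_{\text{ref}}(y_a \mid x)} - \beta \log \frac{\pi_\theta(y_b \mid x)}{\pi_{\text{ref}}(y_b \mid x)}\right), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$

Under this assumption, the loss above is simply the negative log of this sigmoid, averaged over the preference triples.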

Based on this formulation, what is the most significant advantage of this training approach?


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science