Concept

Preference Models as a Sequential Step for Generalization

To overcome the generalization limits of standard instruction fine-tuning and to better capture complex human preferences, preference models are often employed as an additional fine-tuning stage. Applying this step after initial instruction fine-tuning lets large language models generalize beyond the explicit instruction-response pairs seen during supervised training.
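One common way to realize this stage is to train a preference (reward) model on pairs of responses where annotators marked one as preferred. The sketch below is a minimal, hypothetical illustration: a toy linear reward model trained with the Bradley-Terry pairwise loss, where the feature vectors, learning rate, and data are invented for the example and stand in for the scalar-reward head of an actual LLM.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def preference_loss(score_chosen, score_rejected):
    """Bradley-Terry negative log-likelihood for one preference pair:
    -log P(chosen preferred) = -log sigmoid(score_chosen - score_rejected)."""
    return -math.log(sigmoid(score_chosen - score_rejected))

def reward(w, features):
    """Toy linear reward model; in practice this is a learned head on the LLM."""
    return sum(wi * xi for wi, xi in zip(w, features))

# Hypothetical feature vectors for a human-preferred and a rejected response.
chosen = [1.0, 0.5]
rejected = [0.2, 0.9]

w = [0.0, 0.0]  # reward-model parameters
lr = 0.5

for _ in range(50):
    margin = reward(w, chosen) - reward(w, rejected)
    # d(loss)/d(margin) = sigmoid(margin) - 1; chain rule gives the update below.
    grad_scale = sigmoid(margin) - 1.0
    w = [wi - lr * grad_scale * (c - r) for wi, c, r in zip(w, chosen, rejected)]
```

After a few updates the model assigns the chosen response a higher score than the rejected one; the resulting reward model can then supply the training signal for a subsequent preference-optimization step.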


Updated 2026-05-01


Tags

Foundations of Large Language Models

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences