Learn Before
Differing Motivations of Instruction and Human Preference Alignment
Instruction alignment and human preference alignment are driven by distinct goals. Instruction alignment is primarily motivated by the need to make a model generate outputs that adhere closely to explicit user commands, whereas human preference alignment is motivated by the need to train a model on broader, often implicit, human feedback about which outputs people actually prefer.
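To make the contrast concrete, here is a minimal sketch of the two training signals (a toy illustration, not the course's implementation; the function names and the Bradley-Terry pairwise form are assumptions):

```python
import math

def sft_loss(logprob_of_ideal_response: float) -> float:
    # Instruction alignment: supervised fine-tuning minimizes the
    # negative log-likelihood of the single human-written response
    # that correctly follows the instruction.
    return -logprob_of_ideal_response

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    # Human preference alignment: a reward model is fit so that the
    # human-preferred response scores higher than the rejected one
    # (a Bradley-Terry style pairwise loss).
    prob_chosen = 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))
    return -math.log(prob_chosen)
```

The first objective needs one gold response per prompt; the second needs only relative judgments between responses, which is exactly the distinction the datasets in the scenario below illustrate.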
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Surrogate Objectives in AI Alignment
Combined Use of Instruction and Human Preference Alignment
Differing Motivations of Instruction and Human Preference Alignment
Instruction Alignment
Human Preference Alignment via Reward Models
A development team is working to improve a large language model's behavior. They create two distinct datasets:
- Dataset 1: A curated set of prompts, each paired with a single, ideal, human-written response that demonstrates how to follow the prompt's instructions correctly.
- Dataset 2: A set of prompts where, for each prompt, a human evaluator has ranked several different model-generated responses from best to worst.
Which statement best analyzes the relationship between these datasets and the two fundamental approaches to model alignment?
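A rough sketch of how the two datasets might be represented (the field names and example entries are illustrative assumptions drawn from this page's examples, not from the course):

```python
# Dataset 1: each prompt is paired with one ideal, human-written
# response. This is the supervised fine-tuning (instruction
# alignment) format.
instruction_data = [
    {
        "prompt": "List three benefits of solar power.",
        "ideal_response": "1. Renewable energy source. "
                          "2. Low operating costs. "
                          "3. Reduced carbon emissions.",
    },
]

# Dataset 2: each prompt has several model-generated responses ranked
# best-to-worst by a human evaluator. This is the preference data
# used to train a reward model.
preference_data = [
    {
        "prompt": "Write a short, encouraging note.",
        "responses_ranked": [
            "You're doing great. Keep going!",  # ranked best
            "Good luck with everything.",
            "Notes are hard to write.",         # ranked worst
        ],
    },
]
```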
Match each fundamental model alignment approach with its primary goal and typical implementation method.
Prioritizing Chatbot Alignment Strategies
Learn After
A research team is refining a language model using two distinct methods.
- Method A: Train the model on a large dataset of specific commands paired with ideal, human-written responses that perfectly execute those commands (e.g., Command: 'List three benefits of solar power.' Ideal Response: a list of exactly three benefits).
- Method B: Show human raters two different model-generated responses to the same open-ended prompt (e.g., 'Write a short, encouraging note') and ask the raters to choose which response they prefer; the model is then updated based on these preferences.
What fundamental difference in goals do these two methods represent?
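As a toy illustration of how Method B's rater choices could drive an update (a minimal sketch under assumed names; the scalar rewards, learning rate, and update loop are illustrative, not the team's actual procedure):

```python
import math

# Start both candidate responses with equal scalar rewards.
rewards = {"You've got this!": 0.0, "Good luck.": 0.0}
preferred, other = "You've got this!", "Good luck."
lr = 0.1

for _ in range(100):
    diff = rewards[preferred] - rewards[other]
    p_prefer = 1.0 / (1.0 + math.exp(-diff))  # Bradley-Terry probability
    grad = p_prefer - 1.0                     # gradient of -log(p_prefer) w.r.t. diff
    rewards[preferred] -= lr * grad           # push the preferred reward up
    rewards[other] += lr * grad               # push the rejected reward down

print(rewards)  # the rater-preferred response ends with the higher reward
```

Method A, by contrast, would simply minimize cross-entropy against the single ideal response, as in the supervised fine-tuning sketch earlier on this page.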
Diagnosing Chatbot Performance Issues
A development team is working on two separate improvement goals for their language model. Match each goal with the alignment methodology it primarily represents.