Concept

Insufficiency of Data Fitting for Aligning with Human Values

Aligning LLMs with human values requires more than fitting the model to a limited dataset of annotated examples. Such datasets rarely capture the full range of desired behaviors, so a model that only reproduces them can still behave poorly on inputs they do not cover. The goal is not merely to replicate specific outputs, but to give the model a general ability to discern which responses are better aligned with human preferences.
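One way to make "discerning which response is better" concrete is to score pairs of responses and penalize the model when the preferred one is not ranked higher. The sketch below assumes a Bradley-Terry style pairwise objective, which this card does not itself specify; the function name and the example scores are hypothetical and chosen only for illustration.

```python
import math


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Negative log-likelihood that the preferred response outranks the rejected one.

    Assumes a Bradley-Terry model: P(preferred > rejected) = sigmoid(s_p - s_r),
    so the loss is -log(sigmoid(s_p - s_r)) = log(1 + exp(-(s_p - s_r))).
    """
    margin = score_preferred - score_rejected
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical scores from some scoring model for two candidate responses.
loss_correct_order = pairwise_preference_loss(2.0, 0.0)  # preferred scored higher
loss_wrong_order = pairwise_preference_loss(0.0, 2.0)    # preferred scored lower
```

The loss depends only on the difference between the two scores, not on matching any fixed target output, which is the distinction the paragraph above draws: the signal is "which response is better", not "reproduce this exact text".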

Updated 2026-01-15

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences