Learn Before
Limitations of Human Feedback for LLM Alignment
While aligning large language models with human preferences is a widely used and effective strategy, it has significant drawbacks. Annotating preference data is expensive and scales poorly, since every preference pair requires a separate human judgment. Moreover, because human feedback is inherently subjective, it can introduce annotator biases into the alignment process.
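To make the cost and subjectivity concerns concrete, here is a minimal sketch of the kind of pairwise preference record human annotators produce. All names (PreferencePair, annotator_id, COST_PER_JUDGMENT_USD) and the per-judgment price are illustrative assumptions, not details from the course material.

```python
# A minimal sketch of pairwise preference data for alignment.
# Each record encodes one human judgment: which of two model
# responses to the same prompt is better.

from dataclasses import dataclass


@dataclass
class PreferencePair:
    prompt: str
    chosen: str        # response the annotator judged better
    rejected: str      # response the annotator judged worse
    annotator_id: str  # annotators can disagree: the subjectivity problem


dataset = [
    PreferencePair(
        prompt="Explain photosynthesis to a child.",
        chosen="Plants use sunlight to turn air and water into food.",
        rejected="Photosynthesis converts CO2 and H2O into glucose via light-dependent reactions.",
        annotator_id="ann_01",
    ),
]

# Every pair requires a paid human judgment, so annotation cost grows
# linearly with dataset size: the scalability problem.
COST_PER_JUDGMENT_USD = 0.50  # illustrative assumption
print(f"Estimated cost for 100k pairs: ${100_000 * COST_PER_JUDGMENT_USD:,.0f}")
```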
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Reward Model as an Imperfect Environment Proxy
Direct Preference Optimization (DPO) Training Process
Comparison of RLHF and DPO Training Pipelines
An AI development team aims to align a large language model to be more helpful. They create a dataset where, for a given prompt, they collect two different responses from the model and have human annotators label which of the two responses is superior. What is the primary and most direct function of this specific type of dataset in a human preference alignment methodology?
A development team is refining a large language model to be more helpful and harmless. They are using a method that involves learning from human judgments about which of two responses is better. Arrange the following three core stages of this alignment process into the correct chronological order.
Insufficiency of Data Fitting for Complex Value Alignment
Comparison of AI Feedback and Human Feedback for LLM Alignment
Outcome-Based Reward Models
AI Chatbot Alignment Strategy
Learn After
AI Feedback as an Alternative to Human Feedback
Evaluating an AI Alignment Strategy
A startup is aligning a new AI financial advisor using preference feedback. The data is collected exclusively from a small, culturally uniform group of the company's own financial experts. Based on the known challenges of this alignment method, what is the most critical potential flaw in this approach?
Critique of Human Feedback for Model Alignment