Concept

Limitations of the Pointwise Method in RLHF

Pointwise methods in RLHF face two significant challenges: a high sensitivity to variance in human feedback and a tendency toward poor generalization. The first issue arises because these methods focus on fitting absolute scores; inconsistent ratings from different annotators can therefore degrade the model's performance. The second problem occurs because training the model to match specific scores, especially with the limited datasets often used in RLHF, can prevent it from learning the broader principles of what constitutes a high-quality response.
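The first challenge can be made concrete with a small sketch. The ratings below are invented for illustration, and the helper names `pointwise_mse` and `ranking` are hypothetical rather than taken from any library. Two annotators agree on which response is best but use different absolute scales, so even the best possible score predictor is left with a nonzero pointwise loss.

```python
from statistics import mean

# Hypothetical ratings (invented for illustration): two annotators score
# the same three responses on a 1-10 scale. Annotator B is systematically
# 3 points harsher, but both agree on the ordering of the responses.
annotator_a = [9.0, 7.0, 4.0]
annotator_b = [6.0, 4.0, 1.0]


def pointwise_mse(predictions, targets):
    """Mean squared error between predicted and annotated absolute scores."""
    return mean((p - t) ** 2 for p, t in zip(predictions, targets))


def ranking(scores):
    """Indices of responses sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# For squared error, the loss-minimizing prediction for each response
# is the mean of the labels that response received.
best_predictions = [(a + b) / 2 for a, b in zip(annotator_a, annotator_b)]

# Even this optimal predictor cannot reach zero loss, because the two
# annotators' absolute scores conflict.
residual_loss = mean([
    pointwise_mse(best_predictions, annotator_a),
    pointwise_mse(best_predictions, annotator_b),
])

# The relative ordering, by contrast, is identical for both annotators.
same_order = ranking(annotator_a) == ranking(annotator_b)

print(f"residual pointwise loss: {residual_loss:.2f}")
print(f"annotators agree on ranking: {same_order}")
```

In this toy setup the residual loss comes entirely from the disagreement about scale, not from any disagreement about which response is better. That scale-driven noise is the kind a score-fitting objective ends up absorbing.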

Updated 2026-05-02

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences