Learn Before
Comparison of Rejection Sampling and RLHF
Compared with Reinforcement Learning from Human Feedback (RLHF), rejection sampling offers a significantly simpler way to integrate human preferences into the training of large language models. It bypasses the complex reinforcement learning loop (typically policy optimization with an algorithm such as PPO) in favor of straightforward supervised fine-tuning on reward-model-selected data.
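To make the selection step concrete, here is a minimal sketch of best-of-N rejection sampling. The `generate` and `score` callables are hypothetical stand-ins for an LLM sampler and a trained reward model, not any particular library's API:

```python
import random
from typing import Callable, List, Tuple

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],       # hypothetical: samples one response from the LLM
    score: Callable[[str, str], float],   # hypothetical: reward model score for (prompt, response)
    n: int = 8,
) -> Tuple[str, float]:
    """Sample n candidate responses and keep the one the reward model scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    scored = [(response, score(prompt, response)) for response in candidates]
    return max(scored, key=lambda pair: pair[1])

# Toy stand-ins so the sketch runs end to end; a real pipeline would plug in
# an LLM sampler and a trained reward model here.
if __name__ == "__main__":
    toy_generate = lambda p: p + " " + random.choice(["ok.", "good answer.", "great, detailed answer."])
    toy_score = lambda p, r: float(len(r))  # placeholder preference signal, not a real reward model
    best, reward = best_of_n("Explain rejection sampling.", toy_generate, toy_score, n=4)
    print(best, reward)
```

Because the surviving responses are ordinary text, they can be folded back into the model with standard supervised fine-tuning, which is what makes the method so much simpler than a full RL loop.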
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Adoption of Rejection Sampling in LLMs
Analyzing a Flawed Model Improvement Pipeline
You are tasked with improving a language model's ability to generate helpful and harmless responses. You decide to use a method that involves generating multiple potential responses to a prompt, scoring them with a separate quality-assessment model, and then using only the best-scoring responses to further train the original model. Arrange the following steps of this process in the correct logical order.
A machine learning team wants to improve a base language model's ability to follow instructions. They have already trained a separate, reliable 'reward model' that can score the quality of any given response. The team wants to use this reward model to enhance the base model's performance directly through a data-centric approach, avoiding more complex training paradigms. Which of the following strategies correctly describes the most effective and direct way to use the reward model for this purpose?
Learn After
A development team aims to align a large language model with human preferences. Their methodology is as follows:
- For each input prompt, generate 16 different responses from the model.
- Use a pre-trained 'reward model' to assign a quality score to each of the 16 responses.
- Select only the single highest-scoring response for that prompt.
- Compile a new dataset consisting of thousands of these prompt-and-best-response pairs.
- Fine-tune the original language model on this new dataset using standard supervised learning methods.
Which statement most accurately evaluates this team's approach?
Choosing an Alignment Strategy for a Startup
Comparing Model Alignment Techniques
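The best-of-16 scenario above ends with supervised fine-tuning on the compiled prompt-and-best-response pairs. As a hedged illustration of that final step, the sketch below shows how the training loss for one such pair could be computed, assuming a Hugging Face-style causal language model interface; the `model` and `tokenizer` arguments are stand-ins, not references to any specific checkpoint:

```python
import torch

def sft_loss(model, tokenizer, prompt: str, response: str) -> torch.Tensor:
    """Next-token-prediction loss on the response tokens only.

    Prompt positions are masked with -100 so the model is trained to
    reproduce the reward-model-selected response, not the prompt.
    """
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    labels = full_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # ignore-index for prompt tokens
    return model(input_ids=full_ids, labels=labels).loss

# Typical use inside a standard training loop (optimizer is hypothetical):
#   loss = sft_loss(model, tokenizer, pair.prompt, pair.response)
#   loss.backward()
#   optimizer.step()
```

This is exactly the "standard supervised learning" step the scenario describes: once best-of-16 filtering has produced the dataset, no reinforcement learning component is involved.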