1Cademy - Comparison of Objectives: Supervised Fine-Tuning vs. RLHF

Comparison of Objectives: Supervised Fine-Tuning vs. RLHF

Supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) are two distinct ways to train large language models, and they differ in what is optimized. In SFT, the model is trained to maximize the probability of a reference output given the input. RLHF instead proceeds in two stages. First, a reward model is trained on human preference data, in which evaluators choose the output they prefer from pairs of model predictions. Second, the reward model supervises fine-tuning of the language model: it scores newly generated outputs, and a reinforcement learning algorithm updates the model parameters to increase those scores.
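The contrast between the objectives can be made concrete with a small sketch. This is an illustration under stated assumptions, not the method of any particular system: the function names are hypothetical, the pairwise loss assumes a Bradley-Terry style preference model (a common choice that the text above does not specify), and the policy update is shown as a plain REINFORCE-style surrogate, whereas practical RLHF pipelines often use PPO with an added KL penalty.

```python
import math

def sft_loss(reference_token_probs):
    """SFT: negative log-likelihood of the reference output.

    Minimizing this maximizes the probability of the reference output
    given the input. Each entry is the model's probability for one
    reference token.
    """
    return -sum(math.log(p) for p in reference_token_probs)

def pairwise_reward_loss(reward_chosen, reward_rejected):
    """Reward model (assumed Bradley-Terry form): -log sigmoid(r_c - r_r).

    The loss is small when the human-preferred output receives the
    higher reward score.
    """
    diff = reward_chosen - reward_rejected
    # Numerically stable softplus(-diff) = log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def policy_gradient_surrogate(rewards, sample_log_probs):
    """RLHF policy step (assumed REINFORCE-style surrogate).

    Minimizing -mean(reward * log_prob) raises the probability of
    sampled outputs that the reward model scores highly.
    """
    n = len(rewards)
    return -sum(r * lp for r, lp in zip(rewards, sample_log_probs)) / n
```

For example, `pairwise_reward_loss(2.0, 0.0)` is smaller than `pairwise_reward_loss(0.0, 2.0)`, reflecting that the reward model is penalized when it scores the rejected output above the preferred one. Note that SFT needs a reference output for every input, while the RLHF policy step only needs a scalar score for whatever the model generates.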

Updated 2026-05-01
