Learn Before
Rationale for Freezing the Reference Model in RLHF
During the initialization stage of Reinforcement Learning from Human Feedback (RLHF), the reference model's parameters are frozen and remain fixed throughout subsequent training. Analyze the primary reason for this design decision and predict the potential negative outcome if the reference model's parameters were instead updated alongside the policy model.
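The mechanism the question alludes to can be sketched in a few lines: the frozen reference model supplies log-probabilities against which a KL penalty is computed, so the penalty always measures drift from the *initial* policy. The following is a minimal illustrative sketch, not an implementation from any specific library; the function name, the per-token log-prob representation, and the `beta` value are all assumptions.

```python
import math

def kl_penalized_reward(rm_score, policy_logprobs, ref_logprobs, beta=0.1):
    """Combine a reward-model score with a KL penalty toward the reference.

    `ref_logprobs` come from a frozen copy of the initial policy. Because
    that copy is never updated, the penalty consistently measures how far
    the current policy has drifted from its starting point. If the
    reference were trained alongside the policy, it would follow the
    policy's drift and the penalty would stop constraining anything.
    (Illustrative sketch; names and beta are assumptions.)
    """
    # Monte-Carlo estimate of KL over the sampled tokens:
    # sum of (log pi(token) - log pi_ref(token)).
    kl = sum(p - r for p, r in zip(policy_logprobs, ref_logprobs))
    return rm_score - beta * kl

# A policy that has not moved incurs zero penalty.
same = [-0.5, -1.2, -0.3]
unchanged = kl_penalized_reward(2.0, same, same)

# A policy that assigns higher probability than the frozen reference
# (i.e. has drifted toward reward-hacking outputs) is penalized.
drifted = [-0.1, -0.4, -0.05]
penalized = kl_penalized_reward(2.0, drifted, same)
```

Here `unchanged` equals the raw reward-model score, while `penalized` is strictly smaller, showing how the frozen reference anchors the policy and discourages degenerate, reward-hacked text.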
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Data Collection for Reward Modeling in RLHF
A machine learning team is implementing a training process that uses human feedback to align a language model. They have access to two base models: a general-purpose pre-trained language model (Model A) and a version of that model that has been further fine-tuned on a set of instructions (Model B). For the first stage of their process, which of the following initialization plans is correct for the policy, reference, reward, and value models?
Analyzing an RLHF Initialization Error