Essay

Rationale for Freezing the Reference Model in RLHF

During the initialization stage of Reinforcement Learning from Human Feedback (RLHF), the reference model's parameters are frozen and remain fixed throughout subsequent training. Analyze the primary reason for this design decision and predict the potential negative outcome if the reference model's parameters were updated alongside the policy model's.
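The role of the frozen reference model can be illustrated with the standard KL-penalized reward used in RLHF fine-tuning. The sketch below is a minimal, self-contained illustration, not any particular library's implementation: all log-probabilities, the reward-model score, and the `BETA` coefficient are made-up values chosen only to show the computation.

```python
import math

# Hypothetical per-token log-probs (made-up values for illustration).
policy_logprobs = [-1.2, -0.8, -2.0]      # log pi_theta(token | context)
reference_logprobs = [-1.5, -1.1, -2.1]   # log pi_ref(token | context), FROZEN

BETA = 0.1                # KL penalty coefficient (assumed hyperparameter)
reward_model_score = 0.7  # scalar score from the reward model (made up)

# Per-token KL estimate against the frozen reference:
# log pi_theta - log pi_ref.
kl_per_token = [p - r for p, r in zip(policy_logprobs, reference_logprobs)]

# Penalized reward: the KL term anchors the policy to the frozen reference.
# If pi_ref were updated alongside pi_theta, the two distributions would
# drift together, the penalty would collapse toward zero, and nothing would
# constrain the policy from reward hacking or diverging from the
# pretrained distribution.
penalized_reward = reward_model_score - BETA * sum(kl_per_token)
print(round(penalized_reward, 4))
```

Because `reference_logprobs` never changes, the penalty grows whenever the policy moves away from its initialization, which is exactly the anchoring effect the essay prompt asks about.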

Updated 2025-10-05

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.4 Alignment - Foundations of Large Language Models

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science