Analyzing the Impact of a Policy Divergence Penalty
When training a language model, a penalty term is often added to the learning objective to limit how much the model's behavior can change in a single update, measured against a reference version of the model. Analyze the potential trade-offs involved in this approach. Specifically, discuss the likely consequences for the training process and final model performance if the weight of this penalty is set (a) too high, and (b) too low.
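To ground the trade-off, here is a minimal sketch of how such a penalty is commonly implemented as a KL-style term subtracted from the reward, as in RLHF-style fine-tuning. The function name, arguments, and the scalar `beta` weight are illustrative assumptions, not a specific library's API.

```python
def kl_penalized_reward(logprob_policy: float, logprob_ref: float,
                        reward: float, beta: float) -> float:
    """Return the reward adjusted by a policy-divergence penalty.

    A common per-token KL estimate is the log-probability gap between
    the current policy and the frozen reference model. Subtracting
    beta * KL discourages updates that move the policy far from the
    reference in a single step.
    """
    kl_estimate = logprob_policy - logprob_ref  # > 0 when policy drifts toward this action
    return reward - beta * kl_estimate

# Illustration of the trade-off discussed in the question:
# a large beta dominates the reward (over-constrained, little learning);
# a small beta barely restrains drift (risking instability or reward hacking).
conservative = kl_penalized_reward(-1.0, -2.0, reward=1.0, beta=5.0)   # penalty swamps reward
permissive = kl_penalized_reward(-1.0, -2.0, reward=1.0, beta=0.01)    # penalty is negligible
```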
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An engineer is fine-tuning a language model and observes that the training process is highly unstable. The model's performance fluctuates wildly, and the training loss sometimes spikes dramatically, suggesting the policy updates are too aggressive. Which of the following modifications to the optimization objective is most specifically designed to counteract this problem by directly constraining the magnitude of policy changes at each step?
Stabilizing an Erratic Training Process