Bayesian Model Averaging for Combining Reward Models
As an alternative to simple weighted averaging, Bayesian model averaging can be used to combine the predictions of an ensemble of reward models. Rather than fixing the weights by hand, it weights each model's prediction by its posterior probability given the data, providing a principled way to account for uncertainty about which model is correct.
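As a minimal sketch of the idea: with a uniform prior over models, the posterior weight of each reward model is proportional to its marginal likelihood on held-out data, and the combined reward is the posterior-weighted average of the individual scores. The function name and the log-evidence values below are hypothetical, chosen only for illustration.

```python
import math

def bma_reward(scores, log_evidences):
    """Combine per-model reward scores via Bayesian model averaging.

    scores: each model's reward for the same (prompt, response) pair.
    log_evidences: log marginal likelihoods log p(D | M_k) of each model
    on held-out preference data (hypothetical values in the example).
    With a uniform prior, posterior weights p(M_k | D) follow from
    Bayes' rule; we compute them with a numerically stable softmax.
    """
    m = max(log_evidences)
    unnorm = [math.exp(le - m) for le in log_evidences]
    z = sum(unnorm)
    weights = [u / z for u in unnorm]
    # BMA prediction: posterior-weighted average of the model scores.
    combined = sum(w * s for w, s in zip(weights, scores))
    return combined, weights

# Example: three reward models with divergent scores for one response,
# and hypothetical log-evidences favoring the first model.
combined, weights = bma_reward([9.0, 2.0, 5.0], [-10.0, -12.0, -11.0])
```

In this sketch the model with the highest evidence dominates the average, so a model that has historically matched human preferences poorly is automatically down-weighted rather than contributing equally, as it would under a simple mean.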
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Related
Combining Reward Models as an Ensemble Learning Problem
Bayesian Model Averaging for Combining Reward Models
Fusion Networks for Combining Reward Models
Multi-Objective Optimization for Policy Training with Multiple Reward Models
Ensemble Learning Techniques for Reward Model Creation
Aspect-Based Reward Model Construction in RLHF
Using Off-the-Shelf LLMs as Reward Models
A team is training a language model to generate helpful cooking recipes. They use a single reward model that scores recipes based on the number of ingredients from a predefined 'healthy' list. They observe that the model starts generating nonsensical recipes that are just long lists of these healthy ingredients, achieving very high reward scores but being completely useless for cooking. Which of the following approaches is the most robust solution to prevent the model from exploiting the reward system in this way?
Reward System Design Strategy
Evaluating a Chatbot Training Strategy
Learn After
A team is developing a system that uses an ensemble of three different reward models to evaluate the helpfulness of AI-generated responses. For a particularly ambiguous user query, the models produce highly divergent scores: Model A gives 9/10, Model B gives 2/10, and Model C gives 5/10. The team wants to combine these scores into a single, reliable reward signal. Why would an aggregation method that weights each model's score based on its posterior probability be more effective in this situation than simply averaging the scores?
Applying Bayesian Model Averaging to Reward Models
Optimizing an Ensemble of Reward Models