Using Off-the-Shelf LLMs as Reward Models
A simple and practical strategy for building reward models is to use an existing, well-trained large language model (LLM) with little to no modification. This 'off-the-shelf' approach leverages the strong generalization abilities of such models: prompted appropriately, a general-purpose LLM can score another model's outputs directly. Using open-source or commercial LLMs as reward models has proven to be an effective method for aligning other LLMs, in some cases achieving state-of-the-art performance on popular tasks.
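As a rough illustration of the pattern, the sketch below prompts an unmodified LLM to grade a (prompt, response) pair and converts its rating into a scalar reward. The `query_llm` helper, the judge prompt, and the 1-to-10 scale are all illustrative assumptions, not details from the text.

```python
import re

# Hypothetical helper standing in for a call to the off-the-shelf LLM
# (an open-source model served locally or a commercial API); replace the
# canned reply with a real client call.
def query_llm(prompt: str) -> str:
    return "8"

# Illustrative judge prompt; the rubric and 1-10 scale are assumptions,
# not prescribed by the text.
JUDGE_TEMPLATE = (
    "Rate the assistant's response to the user prompt on a scale from 1 "
    "(poor) to 10 (excellent), considering helpfulness and harmlessness. "
    "Reply with the number only.\n\n"
    "User prompt:\n{prompt}\n\n"
    "Assistant response:\n{response}\n\n"
    "Score:"
)

def off_the_shelf_reward(prompt: str, response: str) -> float:
    """Score a (prompt, response) pair with an unmodified LLM acting as
    the reward model; the scalar can feed an RLHF objective directly."""
    judgment = query_llm(JUDGE_TEMPLATE.format(prompt=prompt, response=response))
    match = re.search(r"\d+(?:\.\d+)?", judgment)
    if match is None:
        return 0.0  # neutral/low fallback when the judge output is unparsable
    score = min(max(float(match.group()), 1.0), 10.0)  # clamp to the rubric
    return (score - 1.0) / 9.0  # normalize the 1-10 rating to [0, 1]

if __name__ == "__main__":
    reward = off_the_shelf_reward(
        "How do I boil an egg?",
        "Place the egg in boiling water for about 7 minutes, then cool it.",
    )
    print(f"reward = {reward:.2f}")  # 0.78 with the canned '8' reply
```

The key design point is that no reward head is trained: the existing model's judgment, elicited through a prompt, serves as the reward signal.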
