Learn Before
Architecture and Function of the RLHF Value Model
The value model, denoted as V_φ with parameters φ, is a component of the RLHF framework responsible for predicting the expected cumulative future reward from a given state. Its training targets are derived from the scores produced by the reward model. Typically, the value model shares a similar architecture with the reward model, often a Transformer decoder with a final linear layer that outputs a single scalar.
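As a concrete illustration, here is a minimal sketch of such a value model in PyTorch. The class name `ValueModel`, the layer sizes, and the MSE regression step are illustrative assumptions, not the course's reference implementation; the essential structure is a decoder-style Transformer backbone whose final linear layer maps the last hidden state to one scalar.

```python
import torch
import torch.nn as nn

class ValueModel(nn.Module):
    """Sketch of an RLHF value model: a decoder-style Transformer
    backbone plus a final linear layer that maps the last hidden
    state to a single scalar -- the predicted future reward."""

    def __init__(self, vocab_size=32000, d_model=512, n_heads=8, n_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Decoder-only blocks: self-attention restricted by a causal mask.
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        # The value head replaces an LM head: hidden state -> one scalar.
        self.value_head = nn.Linear(d_model, 1)

    def forward(self, token_ids):
        seq_len = token_ids.size(1)
        causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)
        hidden = self.backbone(self.embed(token_ids), mask=causal_mask)
        # Summarize the state with the final token's hidden vector.
        return self.value_head(hidden[:, -1, :]).squeeze(-1)

# Usage: predict a value for each of two token sequences (states).
model = ValueModel()
states = torch.randint(0, 32000, (2, 16))   # (batch, seq_len)
values = model(states)                      # (batch,) one scalar per state

# One illustrative training step: regress predictions toward returns
# derived from reward-model scores (MSE is an assumed choice here).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
target_returns = torch.tensor([0.7, -0.2])  # e.g., from the reward model
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(states), target_returns)
loss.backward()
optimizer.step()
```

Note how little separates this from a reward model: the backbone and scalar head are the same, but the value model is queried at intermediate states to predict future reward, whereas a reward model scores a completed response.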
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Target Model (Policy Model) in RLHF
Reference Policy Definition in RLHF
Architecture and Function of the RLHF Reward Model
A development team is building a system to align a large language model using reinforcement learning from human feedback. Their setup includes a target model for text generation, a reference model, a reward model to score outputs based on human preferences, and a value model to predict future rewards. For computational efficiency, they decide to build the reward model using a Convolutional Neural Network (CNN) and the value model using a Recurrent Neural Network (RNN), while keeping the target and reference models as Transformer decoders. What is the most significant architectural inconsistency in this design compared to a standard implementation?
LLM as the Agent in RLHF
An alignment process for a large language model uses a system composed of four distinct models, all sharing a common underlying architecture. Match each model component with its primary role in this system.
Architectural Consistency in Feedback-Based LLM Alignment
In a typical system for aligning a language model with human feedback, it is common practice to use a Transformer-based architecture for the text-generating models, while employing simpler, non-Transformer architectures for the reward and value models to reduce computational overhead.
Learn After
In a system designed to align a language model with human preferences, one component functions as a 'critic'. It takes the current state (e.g., a conversation history) as input and outputs a single scalar value predicting the total expected future rewards from that state. This component's architecture is often a large language model with a final linear layer for the scalar output. Which statement best distinguishes this specific component from others in the system?
Distinguishing Model Outputs in Preference Alignment
Diagnosing a Reinforcement Learning System