Concept

Architecture and Function of the RLHF Value Model

The value model, denoted $V_\omega(\cdot)$ with parameters $\omega$, is a component of the RLHF framework responsible for predicting the expected cumulative future reward from a given state. It is trained using the scores produced by the reward model as supervision. Typically, the value model shares a similar architecture with the reward model, often a Transformer decoder with a final linear layer that maps each hidden state to a scalar value.
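As a minimal sketch of how the value model's regression targets can be derived from reward-model scores, the snippet below computes the discounted cumulative future reward (the return) at each step; the function name, the discount factor `gamma`, and the sparse terminal-reward example are illustrative assumptions, not details from the text.

```python
def discounted_returns(rewards, gamma=0.99):
    """Cumulative discounted future reward (return) at each step.

    These returns are the quantity the value model V(s_t) is trained
    to predict: G_t = sum_{k>=0} gamma^k * r_{t+k}.
    Note: gamma and the helper name are illustrative assumptions.
    """
    G = 0.0
    returns = []
    for r in reversed(rewards):  # accumulate from the last step backward
        G = r + gamma * G
        returns.append(G)
    return returns[::-1]

# Example: the reward model assigns a single score of 1.0 at the final token.
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

In RLHF, reward-model scores are often sparse (one score per full response), so earlier states receive value mainly through this discounted propagation.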

Updated 2026-05-02

Tags: Ch.4 Alignment - Foundations of Large Language Models, Foundations of Large Language Models, Foundations of Large Language Models Course, Computing Sciences