Concept

Dual Learning Tasks of RLHF: Reward and Policy Learning

Reinforcement Learning from Human Feedback (RLHF) comprises two distinct learning stages. The first stage is reward model learning, in which a model is trained to score agent outputs from human feedback, typically preference comparisons between pairs of candidate outputs. The second stage is policy learning, in which the agent's policy is optimized with a reinforcement learning algorithm (commonly PPO), using the trained reward model as the reward signal.
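The two stages can be seen in miniature in the sketch below: a toy reward model trained with a Bradley-Terry pairwise loss on preferred vs. rejected outputs, followed by a single policy update guided by that reward model. Everything here is an illustrative assumption rather than any specific RLHF implementation: the `RewardModel` class, the `VOCAB`/`HIDDEN` sizes, and the random token batches are stand-ins, and a plain REINFORCE step stands in for PPO to keep the example short.

```python
# Minimal sketch of RLHF's two learning stages (toy models, random data).
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN = 100, 32  # illustrative sizes, not from any real system

class RewardModel(nn.Module):
    """Scores a sequence of token ids with a single scalar reward."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.score = nn.Linear(HIDDEN, 1)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h = self.embed(tokens).mean(dim=1)     # mean-pool over the sequence
        return self.score(h).squeeze(-1)       # (batch,) scalar rewards

# Stage 1: reward model learning from human preference pairs.
# The Bradley-Terry loss pushes the reward of the human-preferred
# output above that of the rejected one.
reward_model = RewardModel()
rm_opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

chosen = torch.randint(0, VOCAB, (8, 16))     # preferred outputs
rejected = torch.randint(0, VOCAB, (8, 16))   # dispreferred outputs
rm_loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
rm_opt.zero_grad()
rm_loss.backward()
rm_opt.step()

# Stage 2: policy learning guided by the (now frozen) reward model.
# A one-step REINFORCE update stands in for PPO in this sketch.
policy = nn.Sequential(nn.Embedding(VOCAB, HIDDEN), nn.Flatten(),
                       nn.Linear(HIDDEN * 8, VOCAB))  # fixed prompt length 8
pi_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

prompts = torch.randint(0, VOCAB, (8, 8))     # (batch, prompt_len)
logits = policy(prompts)                      # next-token distribution
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()                        # sampled continuation token

with torch.no_grad():                         # reward model is a fixed guide
    reward = reward_model(torch.cat([prompts, action.unsqueeze(1)], dim=1))

pi_loss = -(dist.log_prob(action) * reward).mean()
pi_opt.zero_grad()
pi_loss.backward()
pi_opt.step()
```

In practice the policy update is usually PPO with a KL penalty against the initial policy to keep the optimized policy close to its starting point; the REINFORCE step above only illustrates where the reward model's scalar score enters the policy gradient.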

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
