Comparison of Process and Outcome Reward Models

Process Reward Models (PRMs) differ from Outcome Reward Models (ORMs) in the granularity of their feedback. PRMs offer a detailed, fine-grained supervisory signal by evaluating every intermediate step of a reasoning process, whereas ORMs provide a coarser signal by assessing only the final outcome.
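The difference in granularity can be illustrated with a minimal sketch. Everything here is an assumption for illustration: real PRMs and ORMs are learned models, whereas `toy_step_score` and `toy_outcome_score` are hypothetical rule-based stand-ins that check simple integer arithmetic, and the step format `a op b = c` is invented for this example.

```python
import operator
from typing import List

# Hypothetical toy setup: each reasoning step is a string "a op b = c".
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def toy_step_score(step: str) -> float:
    """Stand-in for a learned PRM: 1.0 if the step's arithmetic holds, else 0.0."""
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    return 1.0 if OPS[op](int(a), int(b)) == int(rhs) else 0.0


def toy_outcome_score(final_step: str, reference: int) -> float:
    """Stand-in for a learned ORM: 1.0 if the final result matches the reference."""
    claimed = int(final_step.split("=")[1])
    return 1.0 if claimed == reference else 0.0


def orm_feedback(steps: List[str], reference: int) -> float:
    """Outcome-level signal: a single score based only on the final step."""
    return toy_outcome_score(steps[-1], reference)


def prm_feedback(steps: List[str]) -> List[float]:
    """Process-level signal: one score per intermediate step."""
    return [toy_step_score(step) for step in steps]


# Target computation: (2 + 3) * 2, whose correct result is 10.
# The first step contains an arithmetic slip; the second step is
# internally consistent but inherits the wrong value.
steps = ["2 + 3 = 6", "6 * 2 = 12"]

orm_signal = orm_feedback(steps, reference=10)  # one number for the whole trace
prm_signal = prm_feedback(steps)                # one number per step
```

In this toy trace the outcome-level signal only reports that the final answer is wrong, while the step-level signal also indicates which step introduced the error.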

Updated 2026-05-06

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences