Case Study

Reward Model Score Adjustment

Based on the provided scenario, explain how the training process will adjust the reward model's scores for Completion X and Completion Y. Describe the principle guiding this adjustment.

Updated 2025-10-02

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science