Reward Model Behavior Analysis
Based on the provided formula for the reward model, explain the most likely reason why the contradictory response described in the case study received a high reward.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A reward model for a generative text model calculates a quality score for a given output using the formula $r = h_{\text{last}} W_r$. In this formula, $h_{\text{last}}$ is the vector representation of the final token in the generated text, and $W_r$ is a learned weight matrix that transforms this vector into a scalar score, $r$. What is a primary conceptual limitation of this specific reward calculation method, especially when evaluating lengthy and complex text?
Reward Model Behavior Analysis
Evaluating a Reward Calculation Method
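
As a minimal sketch of the reward calculation described in the related question above (the final token's vector mapped to a scalar by a learned weight), the following toy example may help. The function name `last_token_reward`, the variable names `h_last` and `w_r`, and all numeric values are hypothetical and chosen only for illustration. The weight matrix is written as a plain vector because the output is a single number.

```python
from typing import Sequence


def last_token_reward(
    hidden_states: Sequence[Sequence[float]],
    w_r: Sequence[float],
) -> float:
    """Score a sequence as the dot product of its final token vector and w_r."""
    h_last = hidden_states[-1]  # only the last row is read; earlier rows are not used directly
    return sum(h * w for h, w in zip(h_last, w_r))


# Hypothetical 2-dimensional token representations, one row per token.
w_r = [1.0, -0.5]
response_a = [[0.2, 0.9], [0.1, 0.4], [2.0, 1.0]]
response_b = [[-3.0, 5.0], [4.0, -4.0], [2.0, 1.0]]  # different earlier rows, same last row

reward_a = last_token_reward(response_a, w_r)
reward_b = last_token_reward(response_b, w_r)
print(reward_a, reward_b)
```

Both responses receive the same score because the calculation reads a single vector. In an actual Transformer the final token's representation is itself computed from the earlier tokens, so the sketch only illustrates that whatever the score reflects about the whole text must be compressed into that one vector, which is where information about a long or self-contradictory response could be lost.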