Evaluating a Reward Calculation Method
A language model's reward function is defined by the equation $r = \mathbf{h}_{\text{last}} \mathbf{W}_r$, where $r$ is the scalar reward, $\mathbf{h}_{\text{last}}$ is the vector representation of the final token in the generated output, and $\mathbf{W}_r$ is a learned weight matrix. Based on this formula, explain one significant advantage and one significant disadvantage of this approach for evaluating the quality of a generated text.
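To make the formula concrete, the following is a minimal NumPy sketch of the calculation, writing the final token's vector as `h_last` and the learned weight matrix as `w_r`. The function name, the variable names, the 2-dimensional hidden size, and all numeric values are illustrative assumptions, not taken from the source; in practice the hidden states would come from the language model and the weights would be learned.

```python
import numpy as np


def reward_from_last_token(hidden_states: np.ndarray, w_r: np.ndarray) -> float:
    """Compute the scalar reward r = h_last @ W_r.

    hidden_states: array of shape (seq_len, d), one vector per token
                   (assumed to come from the language model).
    w_r:           array of shape (d, 1), the learned weight matrix.
    """
    h_last = hidden_states[-1]      # vector of the final token, shape (d,)
    score = h_last @ w_r            # shape (1,)
    return score.item()             # plain Python float


# Toy values chosen by hand purely for illustration (hypothetical data).
hidden_states = np.array([
    [1.0, 2.0],    # first token
    [0.5, -1.0],   # second token
    [2.0, 3.0],    # final token: the only row used by the formula
])
w_r = np.array([[0.5], [-1.0]])

reward = reward_from_last_token(hidden_states, w_r)
print(reward)  # 2.0 * 0.5 + 3.0 * (-1.0) = -2.0
```

In this sketch only the last row of `hidden_states` enters the product directly; earlier tokens can influence the score only through whatever the model has already encoded in that final vector.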
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A reward model for a generative text model calculates a quality score for a given output using the formula $r = \mathbf{h}_{\text{last}} \mathbf{W}_r$. In this formula, $\mathbf{h}_{\text{last}}$ is the vector representation of the final token in the generated text, and $\mathbf{W}_r$ is a learned weight matrix that transforms this vector into a scalar score, $r$. What is a primary conceptual limitation of this specific reward calculation method, especially when evaluating lengthy and complex text?
Reward Model Behavior Analysis