Learn Before
Reward Model Loss Calculation
An engineer is training two reward models, Model A and Model B, using a segment-based rating loss function. For a specific text segment, the human-provided target score is 4.0. Model A predicts a score of 3.9 for this segment, while Model B predicts a score of 3.0. Based on the principles of the rating loss function, which model will receive a stronger corrective signal (i.e., a larger loss value to minimize) for this specific segment, and why?
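The comparison above can be sketched numerically. This is a minimal illustration assuming the rating loss for a single segment is the squared error between the target and the predicted score; the function name is hypothetical.

```python
# Per-segment rating loss, assumed here to be the squared error
# (target - prediction)^2, consistent with MSE-style training.
def segment_loss(target: float, prediction: float) -> float:
    """Squared error between the human target score and the model's prediction."""
    return (target - prediction) ** 2

target = 4.0
loss_a = segment_loss(target, 3.9)  # Model A: error of 0.1 -> squared error ~0.01
loss_b = segment_loss(target, 3.0)  # Model B: error of 1.0 -> squared error 1.0

# Model B's loss (1.0) is roughly 100x Model A's (~0.01), so Model B
# receives the stronger corrective signal on this segment.
```

Because the error is squared, the corrective signal grows quadratically with the gap between prediction and target, which is why Model B's larger miss is penalized far more heavily than Model A's near-miss.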
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Unit Reward Function for Segments
Reward Model Loss Calculation
A reward model is being trained to score segments of a generated text. The training objective is to maximize a loss function defined as the negative mean squared error between the model's predicted scores and the provided target scores for each segment. If, during training, the calculated loss for a batch of segments is a value very close to zero (e.g., -0.001), what does this indicate about the model's performance on that specific batch?
Behavior of the Rating Loss Function
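The negative-mean-squared-error objective in the related question above can be sketched as follows. This is a minimal illustration under the stated assumption that the loss is the negative MSE over a batch of segments; the batch values are hypothetical.

```python
# Batch loss assumed to be the negative mean squared error, which training
# maximizes. Its maximum is 0, reached only when every prediction equals
# its target, so a value near zero indicates near-perfect predictions.
def neg_mse(targets, predictions):
    errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return -sum(errors) / len(errors)

targets = [4.0, 2.0, 3.5]          # hypothetical human target scores
close_preds = [3.99, 2.02, 3.49]   # predictions very close to the targets
far_preds = [3.0, 1.0, 2.0]        # predictions far from the targets

# neg_mse(targets, close_preds) is a small negative number near zero,
# while neg_mse(targets, far_preds) is much more negative.
```

A loss close to zero on a batch (e.g. -0.001) therefore indicates the model's predicted scores almost exactly match the targets for that batch, since any prediction error pushes the negative MSE further below zero.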