Learn Before
Aggregated Reward as the Sum of Segment-Based Rewards
The total reward for a given input \mathbf{x} and a generated sequence \mathbf{y}, denoted r(\mathbf{x}, \mathbf{y}), can be calculated by summing the individual rewards of its n constituent segments. This aggregation is defined by the formula:
r(\mathbf{x}, \mathbf{y}) = \sum_{k=1}^{n} r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k)
Here, r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k) is the reward function for the k-th segment. This segment-level reward can depend on the initial input, the entire output sequence, and an average value \bar{\mathbf{y}}_k associated with that specific segment.
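The summation over segments can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names aggregate_reward and toy_segment_reward, the example segments, and the per-segment scores are all hypothetical, and a real system would replace the lookup table with a learned reward model.

```python
import math
from typing import Callable, Sequence

# Hypothetical type: a segment-level reward takes the input x, the full
# output y, and one segment, and returns a scalar score.
SegmentReward = Callable[[str, str, str], float]


def aggregate_reward(x: str, y: str, segments: Sequence[str],
                     segment_reward: SegmentReward) -> float:
    """Return the sum of segment-level rewards r(x, y, segment_k) for k = 1..n."""
    return sum(segment_reward(x, y, seg) for seg in segments)


# Toy stand-in for a learned reward model: a fixed score per segment.
# These values are made up purely for illustration.
TOY_SCORES = {"intro": 1.2, "method": -0.5, "result": 0.8, "outro": -0.2}


def toy_segment_reward(x: str, y: str, segment: str) -> float:
    # A real segment reward could use x and y; this toy version ignores them.
    return TOY_SCORES[segment]


segments = ["intro", "method", "result", "outro"]
total = aggregate_reward("prompt", " ".join(segments), segments, toy_segment_reward)
print(f"total reward: {total:.2f}")
```

Passing the segment reward in as a function keeps the aggregation step independent of how each segment is scored, which mirrors the formula: only the sum is fixed, not the form of the per-segment term.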
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A language model is given the input prompt, 'Write a short poem about a rainy day.' It generates the response, 'The sky weeps, and the world listens.' A separate evaluation model then assesses this response for the given prompt and assigns it a quality score of 9.2. If this evaluation process is represented by the function r(\mathbf{x}, \mathbf{y}), which option correctly assigns the elements of this scenario to the function's variables?
In the context of evaluating a language model's output, a function is commonly expressed as r(\mathbf{x}, \mathbf{y}). Match each component of this notation to its correct description.
Reward Function as a Linear Transformation of the Last Hidden State
Aggregated Reward as the Sum of Segment-Based Rewards
Interpreting Reward Model Notation
Learn After
Objective Function for Policy Learning in RLHF
A language model generates a response that is evaluated by breaking it into four distinct segments. A reward function assigns a score to each segment based on its quality. The scores for the segments are: Segment 1: +1.2, Segment 2: -0.5, Segment 3: +0.8, and Segment 4: -0.2. If the total reward for the entire response is calculated by summing the rewards of its individual segments, what is the total reward?
A language model generates a three-paragraph summary of a research paper. The first paragraph accurately introduces the paper's objective. The second paragraph correctly describes the methodology but contains a significant factual error about the main finding. The third paragraph draws a logical, but ultimately incorrect, conclusion based on the error in the second paragraph. If the total quality score for the summary is calculated as the sum of scores from each paragraph (segment), which segment is most likely to receive the lowest score?
Debugging a Recipe-Generating Language Model