Debugging a Recipe-Generating Language Model
Based on the principle of aggregating rewards from individual segments, propose a change to the reward system to more effectively penalize the specific type of error described in the case study. Explain why your proposed change would be more effective than the current single-score system.
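The segment-level aggregation the question asks about can be sketched in Python. This is a minimal illustration, not the case study's actual reward system: the reward values, the `flagged` argument, and the `error_penalty` multiplier are all hypothetical assumptions introduced here.

```python
def total_reward(segment_rewards, error_penalty=2.0, flagged=()):
    """Sum per-segment rewards, amplifying the negative reward of any
    segment flagged as containing the targeted error type (illustrative)."""
    total = 0.0
    for i, r in enumerate(segment_rewards):
        if i in flagged and r < 0:
            r *= error_penalty  # amplify the penalty on the flagged segment
        total += r
    return total

# Hypothetical per-segment scores for a four-segment response.
rewards = [1.2, -0.5, 0.8, -0.2]
plain = total_reward(rewards)                  # plain sum over segments
penalized = total_reward(rewards, flagged={1}) # segment 2 penalized more heavily
print(plain, penalized)
```

Unlike a single holistic score, the per-segment sum lets the penalty be localized: only the flagged segment's contribution changes, so the model receives a training signal that identifies where the error occurred.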
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Objective Function for Policy Learning in RLHF
A language model generates a response that is evaluated by breaking it into four distinct segments. A reward function assigns a score to each segment based on its quality. The scores for the segments are: Segment 1: +1.2, Segment 2: -0.5, Segment 3: +0.8, and Segment 4: -0.2. If the total reward for the entire response is calculated by summing the rewards of its individual segments, what is the total reward?
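The sum described in this related question can be checked directly; the segment values below come straight from the prompt.

```python
# Per-segment rewards given in the question.
segment_rewards = [1.2, -0.5, 0.8, -0.2]

# Total reward is the sum over all segments.
total = round(sum(segment_rewards), 2)
print(total)  # 1.3
```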
A language model generates a three-paragraph summary of a research paper. The first paragraph accurately introduces the paper's objective. The second paragraph correctly describes the methodology but contains a significant factual error about the main finding. The third paragraph draws a logical, but ultimately incorrect, conclusion based on the error in the second paragraph. If the total quality score for the summary is calculated as the sum of scores from each paragraph (segment), which segment is most likely to receive the lowest score?