Learn Before
Evaluating a Singular Alignment Approach
Based on the following scenario, analyze the fundamental flaw in the startup's alignment strategy and explain why it led to the specific negative outcomes observed.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Critique of a Singular Alignment Strategy
A development team is aligning a new large language model. Their sole strategy is to use a reward model that gives high scores for outputs that are factually accurate and verifiable. Why is this singular focus likely to result in an inadequately aligned model?
Evaluating a Singular Alignment Approach