Critique of a Fixed AI Constitution
A leading AI development company proposes to align its advanced language model with a 'fixed constitution': a set of core principles derived from a comprehensive global survey of human values conducted in the present day. The company argues that this fixed rule set will keep the model's behavior stable and predictable over time. Critically evaluate this strategy. Given the challenges of aligning AI with human values, what is the primary long-term risk of this approach, and why?
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Ch.5 Inference - Foundations of Large Language Models
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating a Global AI Moderation Strategy
An AI assistant is designed to be a 'helpful and harmless' conversational partner and is deployed globally. Soon after launch, user feedback reveals a significant issue: users in Japan tend to find the AI 'too direct and assertive,' while users in the United States often describe it as 'too passive and indirect.' What fundamental challenge in creating safe and useful AI systems does this conflicting feedback most clearly illustrate? (Hint: consider whether 'helpful and harmless' has a single, culture-independent meaning.)