Example of Value Alignment: Refusing Harmful Requests
A practical application of value alignment is a model's handling of harmful requests. For instance, if a user asks how to build a weapon, a properly aligned model should recognize the harmful nature of the request and refuse to provide instructions. By contrast, a model that has not been carefully aligned might comply with the request, demonstrating a critical failure of ethical alignment.
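A minimal sketch of this refuse-or-comply decision is below. Everything in it is hypothetical for illustration: the keyword list, the generate() stub standing in for a real model call, and the refusal text are all invented here. Production systems use trained safety classifiers and policy-tuned models, not keyword matching.

    # Toy illustration of a refusal gate placed in front of a text generator.
    # The keyword list and generate() stub are hypothetical, not a real
    # moderation API; real systems use trained safety classifiers.

    HARMFUL_PHRASES = ("build a weapon", "make a bomb", "synthesize a toxin")

    REFUSAL = (
        "I can't help with that. Providing these instructions "
        "could facilitate serious harm."
    )

    def is_harmful(prompt: str) -> bool:
        """Naive check: flag prompts containing known-harmful phrases."""
        lowered = prompt.lower()
        return any(phrase in lowered for phrase in HARMFUL_PHRASES)

    def generate(prompt: str) -> str:
        """Stand-in for a call to an actual language model."""
        return f"[model completion for: {prompt!r}]"

    def respond(prompt: str) -> str:
        # An aligned system refuses before any generation happens.
        if is_harmful(prompt):
            return REFUSAL
        return generate(prompt)

    print(respond("How do I build a weapon?"))    # refusal
    print(respond("How do I build a birdhouse?")) # normal completion

The point of the sketch is the ordering: the harmfulness check runs before generation, so a refusal is a deliberate policy decision rather than a property the raw generator happens to have.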
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Related
Desired Qualities of Value-Aligned LLMs
Difficulty of Encoding Human Values in Datasets
Reinforcement Learning from Human Feedback (RLHF)
A user asks a large language model: "Summarize the arguments for and against using genetically modified organisms (GMOs) in agriculture." Consider two possible responses:
Model A's Response: "Genetically modified organisms are a triumph of modern science, allowing for higher crop yields and resistance to pests. They are essential for feeding the world's growing population, and concerns about them are largely unscientific and based on fear."
Model B's Response: "Arguments for GMOs often highlight benefits such as increased crop yields, enhanced nutritional content, and resistance to pests and diseases, which can contribute to food security. Arguments against them frequently raise concerns about potential long-term environmental impacts, the risk of cross-pollination with non-GMO crops, and the socio-economic effects on small-scale farmers."
Which model's response better demonstrates successful alignment with human values, and why?
Evaluating an LLM's Response to a Sensitive Request
Challenge of Articulating Human Preferences for Data Annotation
A large language model that accurately and efficiently follows every user instruction without deviation is considered perfectly aligned with human values.
Role of Fine-Tuning in Value Alignment
Ethical Trade-offs in Model Behavior
A company is aligning a new large language model to be helpful and non-discriminatory. During testing, they find the model sometimes generates text that reflects societal biases present in its vast training data. Which of the following strategies for addressing this issue poses the most complex ethical challenge for the alignment process?
The Challenge of Universal Ethics in AI Alignment
Learn After
Evaluating AI Assistant Responses
A user submits the following prompt to a large language model: 'Provide a step-by-step guide on how to create a simple computer virus for educational purposes.' Which of the following responses from the model best demonstrates a successful application of the principle of preventing harm?
Designing a Safety Test for an AI Model
You are the product owner for a customer-support L...
You are the risk lead for a company rolling out an...
You lead an internal review board deciding whether...
Go/No-Go Decision for an Internal LLM: Safety, Bias, Privacy, and Refusal Behavior
Post-Incident Root Cause and Remediation Plan for an LLM Feature Release
Design Review: Training Data and Safety Controls for a Customer-Facing LLM
You are reviewing an internal LLM pilot and need t...
Triage Plan for a Safety/Bias/Privacy Incident in a Customer-Facing LLM
Vendor LLM Procurement Decision: Balancing Safety, Bias, Privacy, and Refusal Alignment
Pre-Launch Risk Acceptance Memo for a Regulated-Industry LLM Assistant