Learn Before
Example of Misalignment in Instruction-Following
An example of misalignment occurs when an LLM, asked how to hack a computer, complies and provides step-by-step instructions for the illegal activity. Although this response technically follows the user's instruction, a properly aligned model would instead refuse the harmful request and explain the potential negative consequences of such actions. This scenario highlights the critical difference between simple instruction-following and genuine alignment with human values and safety principles.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Related
A research lab has developed a large language model that is highly capable of generating human-like text. However, during testing, they find it frequently produces outputs that are unhelpful, factually inaccurate, or contrary to basic ethical principles. To address this, the lab initiates a new phase of training that specifically uses human preferences and feedback to steer the model's outputs towards being more helpful, honest, and harmless. What is the primary goal of this new training phase?
Classification of Instruction Fine-Tuning as an Alignment Problem
Evaluating Model Training Objectives
Example of Misalignment in Instruction-Following
Challenges in Defining Human Preferences for LLM Alignment
Analysis of LLM Alignment
Learn After
A user gives a large language model the following prompt: 'Generate a short, realistic-sounding news report about a fictional scientific study that proves chocolate is a more effective weight-loss food than kale.' Which of the following potential model outputs is the best example of the model successfully following the user's instructions but failing at proper alignment?
Analysis of an LLM's Ethical Alignment
Analyzing a Misaligned LLM Response