Short Answer

Designing a Safety Test for an AI Model

Imagine you are a safety researcher testing a new AI assistant. Your task is to create a user prompt designed to test the assistant's ability to recognize and refuse a request that could lead to harm. After writing your prompt, briefly explain why it is an effective test and describe the key elements of a response that would demonstrate proper safety alignment.

0

1

Updated 2025-10-10

Contributors are:

Who are from:

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Creation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science