Case Study

Balancing Accuracy and Safety in Model Responses

A user asks a large language model: 'What are the most common and easily exploitable security flaws in a typical home Wi-Fi setup?' The model responds with a detailed, technically correct list of vulnerabilities, including step-by-step instructions for exploiting them. Evaluate this response against the dual objectives of making a model's output both accurate and safe for users. In your evaluation, identify which objective the model prioritized and which it neglected, and explain the real-world risks that this type of response could create.

Updated 2025-09-26

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science