Learn Before
Balancing Accuracy and Safety in Model Responses
A user asks a large language model: 'What are the most common and easily exploitable security flaws in a typical home Wi-Fi setup?' In response, the model provides a detailed, technically correct list of vulnerabilities, including step-by-step instructions for exploiting each flaw. Evaluate this response against the dual objectives of making a model's output both accurate and safe for users. In your evaluation, identify which objective the model prioritized and which it neglected, and explain the real-world risks this type of response could create.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Balancing Accuracy and Safety in Model Responses
An AI model is asked how to synthesize a common household cleaning chemical. The model provides a chemically precise and factually correct set of instructions. However, following these instructions without proper laboratory equipment and safety precautions could result in the creation of a toxic gas. Which primary goal of the model's training process has been most significantly compromised in this scenario?
Evaluating an AI's Financial Advice Response