1Cademy - Correcting a Chatbot's Risky Advice

Learn Before

Suitability of Fine-Tuning for Aligning with Human Values

Case Study

Correcting a Chatbot's Risky Advice

Given the scenario below, propose the most appropriate training methodology to correct the model's behavior and justify why this approach is particularly well-suited for this type of problem.

Updated 2025-10-06

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

A technology company has developed a powerful language model by training it on a massive, diverse dataset from the public internet. During internal testing, the model demonstrates strong general knowledge but also occasionally generates biased, unhelpful, or factually incorrect content. The company's primary goal is to ensure the model's public-facing behavior consistently reflects its core values of safety, accuracy, and helpfulness. Which of the following strategies represents the most direct and effective approach for the company to achieve this specific goal?
Comparing Training Phases for Behavioral Alignment