Learn Before
Evaluating Model Training Objectives
A company develops a chatbot for customer support. To improve its efficiency, they train it on a dataset where the 'best' responses are those that lead to the quickest conversation conclusion. After deployment, they observe that the chatbot is very fast at ending conversations, but customer satisfaction has plummeted because the bot often provides abrupt, unhelpful answers to close tickets quickly. Based on the goal of making a model's behavior consistent with human intentions, explain why this training process failed.
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research lab has developed a large language model that is highly capable of generating human-like text. However, during testing, they find it frequently produces outputs that are unhelpful, factually inaccurate, or contrary to basic ethical principles. To address this, the lab initiates a new phase of training that specifically uses human preferences and feedback to steer the model's outputs towards being more helpful, honest, and harmless. What is the primary goal of this new training phase?
Classification of Instruction Fine-Tuning as an Alignment Problem
Evaluating Model Training Objectives
Example of Misalignment in Instruction-Following
Challenges in Defining Human Preferences for LLM Alignment
Analysis of LLM Alignment