Essay

Evaluating Training Objectives for a Chatbot

A company is developing a customer service chatbot and has two primary training datasets. Dataset A consists of customer queries, each paired with a single ideal response written by an expert. The training goal for Dataset A is to maximize the likelihood that the model generates this exact ideal response. Dataset B consists of customer queries, each paired with two different model-generated responses and a label indicating which response a human preferred. The training goal for Dataset B is to generate responses that humans are more likely to prefer.

Analyze these two training approaches. Which approach is better suited for ensuring factual accuracy, and which is better for capturing a helpful and polite tone? Justify your reasoning by explaining the fundamental difference in their optimization objectives.
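
As a starting point for the analysis, the two objectives in the scenario can be written as loss functions. This is only a minimal sketch under stated assumptions. The scenario does not name a specific method for Dataset B, so the pairwise Bradley-Terry loss, a common choice for preference data, is assumed here. The function names and the scalar "scores" are hypothetical and chosen for illustration.

```python
import math


def sft_loss(token_logprobs):
    """Dataset A objective: negative log-likelihood of the expert response.

    token_logprobs holds the log-probability the model assigns to each
    token of the single expert-written response (illustrative input).
    """
    return -sum(token_logprobs)


def preference_loss(score_preferred, score_rejected):
    """Dataset B objective (assumed Bradley-Terry form).

    Computes -log(sigmoid(score_preferred - score_rejected)). The scores
    are hypothetical scalars the model assigns to the two responses.
    """
    margin = score_preferred - score_rejected
    # Numerically stable evaluation of log(1 + exp(-margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Expert response of two tokens, each given probability 0.5 by the model.
    print(sft_loss([math.log(0.5), math.log(0.5)]))
    # Preferred response scored higher than the rejected one: small loss.
    print(preference_loss(2.0, 0.0))
    # Preferred response scored lower than the rejected one: larger loss.
    print(preference_loss(0.0, 2.0))
```

In this sketch the first loss depends only on how much probability the model places on one fixed reference response, while the second depends only on the difference between the scores of two candidate responses. Neither function is tied to a particular training library.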

Updated 2025-10-06

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science