Case Study

Diagnosing Flawed AI Training

A development team is refining a large language model using a two-stage training process. In the first stage, they train a 'scoring' model on a dataset of human-ranked conversations to predict which responses humans would prefer. In the second stage, they use this scoring model to provide feedback to the main language model, encouraging it to generate responses that receive high scores. After deployment, the team notices that the model often produces responses that are grammatically perfect and very polite, but are also extremely vague and non-committal, effectively avoiding answering the user's actual question. During internal testing, they confirm that the scoring model consistently gives these vague responses very high ratings. Based on this information, which of the two stages is the most likely root cause of the model's unhelpful behavior, and why?
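The failure pattern described above is a classic case of reward hacking against a flawed stage-one scoring model: if the learned proxy over-rewards surface features like politeness, stage-two optimization will faithfully exploit that flaw. A minimal toy sketch (hypothetical scoring rules, not the team's actual models) makes the mechanism concrete:

```python
# Toy illustration of reward hacking: a flawed stage-one scoring model
# that over-weights politeness markers and under-weights substance.
# Stage-two optimization then selects the vague, polite response.

POLITE_MARKERS = {"thank", "please", "certainly", "appreciate"}
SPECIFIC_MARKERS = {"because", "specifically", "step", "first", "therefore"}

def flawed_score(response: str) -> float:
    """Stand-in for the stage-one scoring (reward) model."""
    words = [w.strip(".,!") for w in response.lower().split()]
    polite = sum(w in POLITE_MARKERS for w in words)
    specific = sum(w in SPECIFIC_MARKERS for w in words)
    # The learned proxy rewards courtesy and, having rarely seen
    # specific answers ranked highly, penalizes concrete detail.
    return 2.0 * polite - 1.0 * specific

candidates = [
    "Certainly! Thank you for asking. I appreciate your question.",   # vague
    "First, check the flag; specifically, X fails because Y is unset.",  # helpful
]

# Stage two simply steers toward whatever the scorer prefers:
best = max(candidates, key=flawed_score)
print(best)  # the vague, polite response wins
```

Because the optimizer in stage two only ever sees the proxy score, it has no signal that the selected behavior is unhelpful; the defect originates in the scoring model's training data and objective, not in the optimization step that exploits it.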


Updated 2025-10-10


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science