Learn Before
Multiple Choice

A research team is training a model to score the quality of text responses. The training data consists of pairs of responses, where for each pair, one is labeled as 'better' than the other. The model's objective is to assign a higher score to the 'better' response in every pair. The team successfully trains two models, Model A and Model B. They discover that the internal parameters of Model A and Model B are significantly different. However, both models achieve 100% accuracy on the training data, correctly assigning a higher score to the 'better' response in every single pair. What fundamental principle of model training does this outcome best demonstrate?

0

1

Updated 2025-09-26

Contributors are:

Who are from:

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science