Multiple Choice

An engineer is training a student language model using a combined objective that balances learning from a teacher model's predictions (distillation loss) and learning from the ground-truth data (standard loss). The interpolation coefficient, λ, weighs the teacher's influence. The engineer observes that the student model quickly learns to mimic the teacher's output, but its performance on a validation set eventually plateaus and fails to surpass the teacher's performance, even though the student has the capacity to do better. What is the most probable cause of this issue related to the adjustment of λ?

0

1

Updated 2025-10-05

Contributors are:

Who are from:

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science