Multiple Choice

A research team is fine-tuning a large language model using a combined loss objective, which includes both a standard language model (LM) loss against ground-truth data and a knowledge distillation (KD) loss from a weaker supervisor model. They observe that while the large model is very good at mimicking the style and general structure of the weak supervisor's outputs, it frequently makes factual errors that are not present in the ground-truth dataset. Which of the following is the most likely cause of this issue and the best corrective action?
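The question does not spell out the exact form of the combined objective, but a common formulation is L = α·L_LM + (1 − α)·L_KD, where L_LM is cross-entropy against the ground-truth token and L_KD is a KL divergence toward the teacher's distribution. The sketch below (the α weighting and the toy 3-token vocabulary are assumptions for illustration) shows how a student that perfectly mimics a weak teacher can drive the KD term to zero while the LM term stays high:

```python
import math

def cross_entropy(student_probs, gold_index):
    # LM loss: negative log-likelihood of the ground-truth token.
    return -math.log(student_probs[gold_index])

def kl_divergence(teacher_probs, student_probs):
    # KD loss: KL(teacher || student) summed over the vocabulary.
    return sum(t * math.log(t / s)
               for t, s in zip(teacher_probs, student_probs) if t > 0)

def combined_loss(student_probs, gold_index, teacher_probs, alpha=0.5):
    # alpha weights the LM term; (1 - alpha) weights the KD term.
    # If alpha is too small, the weak teacher's errors dominate training.
    return (alpha * cross_entropy(student_probs, gold_index)
            + (1 - alpha) * kl_divergence(teacher_probs, student_probs))

# Toy 3-token vocabulary: the student matches the weak teacher exactly,
# but the teacher puts most of its mass on the wrong token (index 1)
# while the ground-truth token is index 0.
student = [0.2, 0.7, 0.1]
teacher = [0.2, 0.7, 0.1]
gold = 0

lm_term = cross_entropy(student, gold)     # high: gold token gets low probability
kd_term = kl_divergence(teacher, student)  # zero: teacher is mimicked perfectly
```

In this situation the total loss can be lowered mostly by imitating the teacher, which matches the observed failure mode: stylistic mimicry with factual errors absent from the ground-truth data.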

Updated 2025-10-10

Tags: Ch.4 Alignment - Foundations of Large Language Models; Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences; Evaluation in Bloom's Taxonomy; Cognitive Psychology; Psychology; Social Science; Empirical Science; Science