1Cademy

Learn Before

Direct Supervision via Knowledge Distillation Loss in Weak-to-Strong Generalization

Multiple Choice

A team is fine-tuning a large, powerful model to perform a specific task. Instead of using a dataset with pre-defined correct answers, they use a smaller, weaker model as a live supervisor. For each input, the large model generates an output, and the weaker model also generates an output. A loss value is then calculated based on the difference between these two outputs. What is the direct and immediate purpose of this calculated loss value within the training loop?
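The training loop described in the question can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the setup of any particular paper or library: both models are reduced to linear softmax classifiers, the "difference" between outputs is taken to be KL divergence, and every name (`W_weak`, `W_strong`, `train`, the learning rate) is hypothetical. The weak model is frozen and only supplies targets; the loss is computed on each step and its gradient is used to adjust the strong model's parameters.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_loss(weak_probs, strong_probs, eps=1e-12):
    # Mean KL(weak || strong) over the batch: how far the strong
    # model's output distribution is from the weak supervisor's.
    per_example = np.sum(
        weak_probs * (np.log(weak_probs + eps) - np.log(strong_probs + eps)),
        axis=-1,
    )
    return float(np.mean(per_example))

def train(steps=200, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(32, 4))       # a batch of inputs (illustrative data)
    W_weak = rng.normal(size=(4, 3))   # weak supervisor: frozen, never updated
    W_strong = np.zeros((4, 3))        # strong model: the only trainable weights
    W_weak_before = W_weak.copy()

    losses = []
    for _ in range(steps):
        weak_probs = softmax(x @ W_weak)      # supervisor's output
        strong_probs = softmax(x @ W_strong)  # student's output
        losses.append(kl_loss(weak_probs, strong_probs))

        # Gradient of the mean KL with respect to the strong logits is
        # (strong_probs - weak_probs) / batch_size; chain through x.
        grad_logits = (strong_probs - weak_probs) / x.shape[0]
        grad_W = x.T @ grad_logits

        # The loss's immediate role: drive this parameter update.
        W_strong -= lr * grad_W

    return losses, W_strong, W_weak, W_weak_before

losses, W_strong, W_weak, W_weak_before = train()
```

In this sketch the loss value is not used to score the weak model or to pick between outputs; it exists to produce the gradient that moves `W_strong` toward the weak supervisor's predictions on each step.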

Updated 2025-09-26
