Diagnosing a Performance Plateau in Supervised Fine-Tuning
Analyze the following training scenario. Identify the primary limitation of the described training objective and explain how you would modify it to enable the 'student' model to potentially achieve higher performance than its 'supervisor'.
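One answer direction can be sketched in code. A pure imitation objective caps the student at the supervisor's quality; a commonly discussed modification is an auxiliary confidence loss that mixes the weak supervisor's label with the student's own hardened (argmax) prediction. The function names, the mixing weight `alpha`, and the toy probabilities below are illustrative assumptions, not part of the card:

```python
import math

def cross_entropy(probs, target_idx):
    # Negative log-likelihood of the target class under the
    # student's predicted probability distribution.
    return -math.log(probs[target_idx])

def aux_confidence_loss(student_probs, weak_label_idx, alpha=0.3):
    # Mix the weak supervisor's label with the student's own
    # hardened (argmax) prediction; a confident student can then
    # override supervisor mistakes instead of imitating them.
    self_label_idx = max(range(len(student_probs)),
                         key=student_probs.__getitem__)
    return ((1 - alpha) * cross_entropy(student_probs, weak_label_idx)
            + alpha * cross_entropy(student_probs, self_label_idx))

# Student is confident in class 0, but the weak supervisor says class 1:
# the self-consistency term partially offsets the imitation penalty.
loss = aux_confidence_loss([0.8, 0.15, 0.05], weak_label_idx=1)
```

When the supervisor agrees with the student's own top prediction, the objective reduces to ordinary imitation; disagreement is penalized less harshly, which is what opens the door to surpassing the supervisor.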
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Diagnosing a Performance Plateau in Supervised Fine-Tuning
A team is fine-tuning a large language model. They have access to a small, high-quality dataset with verified ground-truth labels, as well as a much larger dataset where labels have been generated by a weaker, smaller model. To maximize the performance of the large model by using both data sources simultaneously, which training objective should they implement?
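The objective the question points toward can be sketched as a weighted sum of a standard LM loss on the verified labels and a distillation (KD) loss against the weaker model's output distribution. This is a minimal pure-Python illustration; the function names, the KL direction, and the weight `alpha` are assumptions for the sketch:

```python
import math

def cross_entropy(probs, target_idx):
    # LM loss against a verified ground-truth label
    return -math.log(probs[target_idx])

def kl_divergence(p, q):
    # KD loss: KL(weak_teacher || student) over the label distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def combined_loss(student_probs, gold_idx, teacher_probs, alpha=0.5):
    # alpha balances distillation against ground-truth supervision
    lm = cross_entropy(student_probs, gold_idx)
    kd = kl_divergence(teacher_probs, student_probs)
    return (1 - alpha) * lm + alpha * kd

student = [0.7, 0.2, 0.1]
teacher = [0.5, 0.3, 0.2]
total = combined_loss(student, gold_idx=0, teacher_probs=teacher, alpha=0.3)
```

Setting `alpha=0` recovers pure supervised fine-tuning on the small gold set; `alpha=1` recovers pure distillation from the weak model. In practice the weights would apply per-batch, since gold labels exist only for the small dataset.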
Visual Diagram of Combined Loss Training for Weak-to-Strong Generalization
Rationale for a Hybrid Training Objective
A research team is fine-tuning a large language model using a combined loss objective, which includes both a standard language model (LM) loss against ground-truth data and a knowledge distillation (KD) loss from a weaker supervisor model. They observe that while the large model is very good at mimicking the style and general structure of the weak supervisor's outputs, it frequently makes factual errors that are not present in the ground-truth dataset. Which of the following is the most likely cause of this issue and the best corrective action?
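The failure mode described (fluent mimicry plus inherited factual errors) suggests the KD term is dominating the LM term. One corrective action can be sketched as gating the KD weight by the weak supervisor's confidence while leaving the ground-truth LM term at full weight; the function name, the threshold, and the weighting scheme are illustrative assumptions:

```python
def gated_combined_loss(lm_loss, kd_loss, teacher_conf,
                        base_kd_weight=0.5, conf_threshold=0.8):
    # If the weak supervisor is unsure, scale down the distillation
    # term so the student stops copying its likely errors; the LM
    # loss on verified ground truth always keeps full weight.
    if teacher_conf >= conf_threshold:
        kd_weight = base_kd_weight
    else:
        kd_weight = base_kd_weight * teacher_conf
    return lm_loss + kd_weight * kd_loss

# High-confidence teacher: KD kept at base weight.
high = gated_combined_loss(lm_loss=1.0, kd_loss=1.0, teacher_conf=0.9)
# Low-confidence teacher: KD contribution shrinks.
low = gated_combined_loss(lm_loss=1.0, kd_loss=1.0, teacher_conf=0.4)
```

The same diagnosis also admits a simpler fix, down-weighting the KD term globally; gating by confidence is just one way to keep distillation's stylistic benefits while limiting its factual damage.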