Short Answer

Rationale for a Hybrid Training Objective

A team is training a large model using a composite loss function. The loss has two components:

  1. A component that penalizes the large model when its output distribution differs from that of a weaker, pre-existing model.
  2. A component that penalizes the large model when its output differs from human-verified, ground-truth labels available for a small dataset.

Analyze the distinct contribution of each of these two components to the overall training process. Why is it beneficial to use both together rather than just one?
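To make the setup concrete, here is a minimal sketch in plain Python of what such a composite loss could look like in a classification setting. The names (`composite_loss`, `alpha`) and the choice of KL divergence for the imitation term and cross-entropy for the supervised term are illustrative assumptions, not part of the question itself:

```python
import math

def softmax(logits):
    # convert raw logits to a probability distribution
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # imitation term: how far the student's distribution q is
    # from the weak teacher's distribution p
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy(probs, label):
    # supervised term: negative log-probability of the verified label
    return -math.log(probs[label])

def composite_loss(student_logits, teacher_logits, label, alpha=0.5):
    # alpha weights imitation of the weak model against
    # fitting the small set of ground-truth labels
    s = softmax(student_logits)
    t = softmax(teacher_logits)
    return alpha * kl_divergence(t, s) + (1 - alpha) * cross_entropy(s, label)
```

With `alpha=0.5` and identical student/teacher logits, the imitation term vanishes and only the supervised term remains, which illustrates how the two components pull in different directions only when the student disagrees with the weak model.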


Updated 2025-10-06


Tags: Ch.4 Alignment - Foundations of Large Language Models, Foundations of Large Language Models, Foundations of Large Language Models Course, Computing Sciences, Analysis in Bloom's Taxonomy, Cognitive Psychology, Psychology, Social Science, Empirical Science, Science