1Cademy - Analyzing the Components of a Model Mimicry Loss Function

Learn Before

General Loss Function for Knowledge Distillation

Short Answer

Analyzing the Components of a Model Mimicry Loss Function

When training a smaller (student) model to mimic a larger (teacher) model, the training objective is to minimize a loss function, formally expressed as $Loss(\text{Pr}^t(\cdot|\cdot), \text{Pr}_{\theta}^s(\cdot|\cdot), \mathbf{x})$. Identify which component of this function is directly adjusted during training, and explain why the other components are considered fixed.
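To make the three arguments of the loss concrete, here is a minimal Python sketch that only evaluates the loss for one input. It assumes a KL-divergence form for $Loss$, which the question itself does not specify, and every name in it (`teacher_probs`, `student_probs`, `mimicry_loss`, the hard-coded weights) is hypothetical and chosen for illustration only.

```python
import math

def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def teacher_probs(x):
    """Hypothetical Pr^t(.|x): a toy 3-class model with hard-coded weights."""
    return softmax([2.0 * x, 0.5 * x, -1.0 * x])

def student_probs(theta, x):
    """Hypothetical Pr^s_theta(.|x): same toy form, with weights theta."""
    return softmax([w * x for w in theta])

def mimicry_loss(teacher, student, x):
    """Assumed instantiation of Loss(Pr^t, Pr^s_theta, x): KL(teacher || student)."""
    p_t = teacher(x)
    p_s = student(x)
    return sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s) if pt > 0)

def loss_for(theta, x):
    """Evaluate the loss for one value of theta on one input x."""
    return mimicry_loss(teacher_probs, lambda inp: student_probs(theta, inp), x)

if __name__ == "__main__":
    x = 1.0
    print(loss_for([0.0, 0.0, 0.0], x))   # student far from the teacher
    print(loss_for([2.0, 0.5, -1.0], x))  # student matching the teacher
```

Under these assumptions, the value is zero when the two distributions agree on $\mathbf{x}$ and positive otherwise, which is the quantity the training objective drives down.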

Updated 2025-10-08
