Benchmark Model in Knowledge Distillation
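The benchmark model in knowledge distillation is trained with a joint loss function that combines the distillation loss and the student loss. The student loss is typically the cross-entropy loss between the ground-truth label and the soft logits of the student model, expressed as $\mathcal{L}_{CE}(y, p(z_s, T=1))$, where $y$ is the ground-truth label, $z_s$ denotes the student's logits, and $p(z_s, T=1)$ is the student's softmax output at temperature $T = 1$.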
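The joint loss described above can be illustrated with a minimal sketch in plain Python. Several details here are assumptions, not taken from the text: the weighting factor `alpha`, the default temperature of 4.0, the function names, and the choice of cross-entropy between temperature-softened teacher and student outputs as the distillation loss (a KL-divergence form is also common).

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy H(target, predicted) for two probability vectors."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def student_loss(student_logits, label):
    """Cross-entropy between the one-hot ground truth and the student's
    softmax output at temperature T = 1."""
    one_hot = [1.0 if i == label else 0.0
               for i in range(len(student_logits))]
    return cross_entropy(one_hot, softmax(student_logits, 1.0))


def distillation_loss(teacher_logits, student_logits, temperature):
    """Assumed form: cross-entropy between the softened teacher and
    softened student distributions at the same temperature."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    return cross_entropy(soft_teacher, soft_student)


def joint_loss(teacher_logits, student_logits, label,
               temperature=4.0, alpha=0.5):
    """Weighted sum of distillation loss and student loss.
    alpha and temperature are illustrative hyperparameters (assumptions)."""
    return (alpha * distillation_loss(teacher_logits, student_logits,
                                      temperature)
            + (1.0 - alpha) * student_loss(student_logits, label))


# Example: one sample with three classes, true class index 1.
teacher = [1.0, 3.0, 0.5]
student = [0.8, 2.0, 0.9]
print(joint_loss(teacher, student, label=1))
```

Setting `alpha=0.0` in this sketch reduces the joint loss to the student loss alone, which is a quick way to check that the two terms are wired together as intended.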
Updated 2026-05-10
Tags
Deep Learning (in Machine learning)
Data Science
Computing Sciences