Techniques for Mitigating the Training Gap

  • Adding a teacher assistant, an intermediate-sized model placed between the teacher and the student, mitigates the training gap. Residual learning reduces the gap further: the assistant structure is used to learn the residual error.
  • Minimize the structural difference between the teacher and student models by combining network quantization with knowledge distillation, so that the student is a small, quantized version of the teacher.
  • Structure compression: transfer the knowledge learned by multiple layers to a single layer. In online settings, the teacher network is typically an ensemble of student networks that share a similar structure.
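
The teacher-assistant idea from the first technique can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: it assumes the common softened-softmax distillation loss (KL divergence scaled by the squared temperature), and the names `distillation_loss` and `chain_hops` as well as all logit values are hypothetical, invented only to exercise the code. No training takes place; the sketch only shows how knowledge would be passed hop by hop (teacher to assistant, assistant to student) instead of in one large step.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature**2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def chain_hops(models):
    """Pair each model with the next smaller one: teacher -> assistant -> student."""
    return list(zip(models[:-1], models[1:]))

# Hypothetical logits for a single input, ordered from largest to smallest model.
# These numbers are assumptions for illustration, not measured results.
teacher = ("teacher", [4.0, 1.0, 0.0])
assistant = ("assistant", [3.0, 1.0, 0.0])
student = ("student", [2.0, 1.0, 0.0])

# Loss if the student were distilled from the teacher directly.
direct = distillation_loss(teacher[1], student[1])

# Loss for each hop when an assistant sits between the two.
hop_losses = [
    (big[0], small[0], distillation_loss(big[1], small[1]))
    for big, small in chain_hops([teacher, assistant, student])
]

print(f"direct teacher -> student: {direct:.4f}")
for src, dst, loss in hop_losses:
    print(f"{src} -> {dst}: {loss:.4f}")
```

In an actual training setup, each hop would be a full distillation run in which the smaller model is optimized against the larger one's softened outputs; the fixed logits here stand in for those model outputs.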

Updated 2022-10-29

Tags

Deep Learning (in Machine learning)
