Techniques for Mitigating the Training Gap
- Adding a teacher assistant, an intermediate-sized model placed between teacher and student, mitigates the training gap. Residual learning reduces the gap further: the assistant structure is used to learn the residual error.
- Minimize the structural difference between the teacher and student models by combining network quantization with knowledge distillation, for example by making the student a smaller, quantized version of the teacher.
- Structure compression: transfer the knowledge learned by multiple layers into a single layer.
- In online settings, teacher networks are ensembles of similarly structured student networks.
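
The ideas above can be illustrated with a minimal sketch in plain Python. It is not taken from any particular paper or library: the function names (`distillation_loss`, `ensemble_teacher_logits`), the example logits, and the temperature of 2.0 are all assumptions chosen for illustration. The loss is the common temperature-softened KL divergence between teacher and student outputs, scaled by the squared temperature as is conventional. With a teacher assistant, the same loss is applied in two hops (teacher to assistant, then assistant to student) instead of one. In the online setting, the teacher signal is assumed here to be the simple average of the students' logits.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Soft-target distillation loss: T^2 * KL(teacher || student)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * kl_divergence(p, q)


def ensemble_teacher_logits(student_logits_list):
    """Online setting (assumed form): the teacher is the mean of student logits."""
    n = len(student_logits_list)
    return [sum(column) / n for column in zip(*student_logits_list)]


# Hypothetical logits for one input, purely for illustration.
teacher = [4.0, 1.0, 0.0]
assistant = [3.0, 1.0, 0.0]
student = [1.0, 1.0, 0.0]

# Teacher-assistant chain: distill in two smaller hops.
hop_1 = distillation_loss(teacher, assistant)
hop_2 = distillation_loss(assistant, student)

# Online distillation: each student is pulled toward the ensemble average.
peers = [[2.0, 0.5, 0.0], [1.0, 1.5, 0.0], [3.0, 1.0, 0.0]]
ensemble = ensemble_teacher_logits(peers)
online_losses = [distillation_loss(ensemble, s) for s in peers]
```

In a real training loop these losses would be computed on model outputs for each batch and usually combined with the ordinary supervised loss; the sketch only shows how the distillation targets are formed in each setting.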
Updated 2022-10-29
Tags
Deep Learning (in Machine learning)