Learn Before
  • Knowledge Distillation


KD Workflow

The teacher's knowledge can take several forms. In response-based distillation, the logits of the large deep model serve as the teacher knowledge; in feature-based distillation, activations, neurons, or features of intermediate layers guide the student's learning; and in relation-based distillation, the student learns relationships between feature maps, layers, or sample pairs rather than mimicking individual outputs.
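As a concrete illustration, here is a minimal sketch of the response-based case, assuming PyTorch; the default temperature and weighting are illustrative choices, not values prescribed by the survey.

```python
# A minimal sketch of response-based knowledge distillation, assuming PyTorch.
# The temperature and alpha defaults are illustrative, not prescribed values.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend a soft loss (mimic the teacher) with a hard loss (fit labels)."""
    # Soften both output distributions with a temperature, then match the
    # student to the teacher with KL divergence.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_loss = F.kl_div(log_student, soft_targets, reduction="batchmean")
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    soft_loss = soft_loss * temperature ** 2
    # Ordinary cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss
```

Feature-based variants typically add a regression term (for example, mean-squared error) between chosen intermediate activations of teacher and student, while relation-based variants match pairwise similarities instead.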

Updated 2022-10-22

Contributor: Lois Wong (University of California, Berkeley)

References


  • Gou, J., Yu, B., Maybank, S. J., & Tao, D. (2021). Knowledge Distillation: A Survey. International Journal of Computer Vision.

Tags

Deep Learning (in Machine Learning)

Data Science

Related
  • Components of a Knowledge Distillation System

  • Extensions

  • Applications

  • Distilling Prompting Knowledge into Soft Prompts

  • Efficient Model Deployment for Mobile Applications

  • A machine learning team is developing a compact model for a mobile application. They have a large, highly accurate 'teacher' model and a smaller 'student' model architecture. Instead of training the student directly on the original dataset with its ground-truth labels (e.g., 'this image is a cat'), they train it to mimic the teacher's full output probability distribution (e.g., '90% cat, 5% dog, 1% tiger...'). Why is this technique often more effective for the student's performance than training it from scratch on the original labels? (A numeric sketch contrasting the two kinds of targets follows this list.)

  • Mechanisms of Knowledge Transfer

  • Context Distillation

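As a rough numeric illustration of the soft-target question above (all probabilities are invented for illustration):

```python
# Hard one-hot label vs. the teacher's soft output for the same image.
# All numbers are invented for illustration.
hard_label    = {"cat": 1.00, "dog": 0.00, "tiger": 0.00}
teacher_probs = {"cat": 0.90, "dog": 0.05, "tiger": 0.01}  # rest spread over other classes

# The ranking 0.90 > 0.05 > 0.01 encodes how strongly the image resembles
# each class, similarity structure (the "dark knowledge") that the one-hot
# label discards and that gives the student a richer training signal.
```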
Learn After
  • Key Challenge
