
Distillation Loss

The distillation loss for relation-based knowledge, defined over relations between feature maps, is (a minimal code sketch follows the list of symbols):

  • $L_{RelD}(f_t, f_s) = L_{R^1}(\psi_t(\acute{f_t}, \check{f_t}), \psi_s(\acute{f_s}, \check{f_s}))$

  • $f_t, f_s$ are feature maps of the teacher and student models

  • $\acute{f_t}, \check{f_t}$ is a pair of feature maps chosen from the teacher

  • $\acute{f_s}, \check{f_s}$ is a pair of feature maps chosen from the student

  • $\psi_t(\cdot), \psi_s(\cdot)$ are similarity functions for pairs of feature maps from the teacher and student models

  • $L_{R^1}(\cdot)$ is the correlation function between the teacher and student feature maps
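
A minimal PyTorch sketch of this loss, assuming $\psi_t, \psi_s$ are cosine similarity over flattened feature maps and $L_{R^1}$ is mean-squared error; both are common but illustrative choices, and the function names here are hypothetical:

```python
import torch
import torch.nn.functional as F

def pairwise_similarity(f_a: torch.Tensor, f_b: torch.Tensor) -> torch.Tensor:
    # psi(.): assumed here to be cosine similarity between two
    # flattened feature maps, computed per sample in the batch.
    a = f_a.flatten(start_dim=1)
    b = f_b.flatten(start_dim=1)
    return F.cosine_similarity(a, b, dim=1)

def relation_distillation_loss(ft_pair, fs_pair) -> torch.Tensor:
    # L_RelD: compare the teacher's pairwise relation psi_t with the
    # student's psi_s. L_{R^1} is taken to be mean-squared error here
    # (an assumption; any correlation/distance function could be used).
    ft_a, ft_b = ft_pair  # pair of feature maps chosen from the teacher
    fs_a, fs_b = fs_pair  # pair of feature maps chosen from the student
    psi_t = pairwise_similarity(ft_a, ft_b)  # teacher relation, shape (batch,)
    psi_s = pairwise_similarity(fs_a, fs_b)  # student relation, shape (batch,)
    return F.mse_loss(psi_s, psi_t)

# Toy usage with random feature maps (batch of 8, 16x16 spatial size):
ft_pair = (torch.randn(8, 64, 16, 16), torch.randn(8, 64, 16, 16))
fs_pair = (torch.randn(8, 32, 16, 16), torch.randn(8, 32, 16, 16))
loss = relation_distillation_loss(ft_pair, fs_pair)
```

Because $\psi(\cdot)$ reduces each pair of feature maps to a scalar relation per sample, the teacher and student feature maps need not share channel dimensions (64 vs. 32 in the toy usage above); only the relations themselves are compared.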


Updated 2022-10-22

Tags

Deep Learning (in Machine Learning)

Data Science