Sparsity for MoE

Sparsity means that, for any given input, only a small subset of the expert sub-networks is activated, which keeps computation efficient. This can be achieved by computing a softmax score for each expert and activating only the few experts with the highest scores (top-k selection).
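The selection step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `softmax` and `top_k_gate`, the example logits, and the choice of k = 2 are all hypothetical, and renormalizing the kept scores so they sum to 1 is one common option that the text above does not prescribe.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of expert logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(logits, k):
    """Keep the k highest softmax scores and zero out the rest.

    Assumption: the kept scores are renormalized to sum to 1.
    """
    scores = softmax(logits)
    # Indices of the k experts with the largest scores.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    kept = sum(scores[i] for i in top)
    gates = [0.0] * len(scores)
    for i in top:
        gates[i] = scores[i] / kept
    return gates

# Hypothetical router logits for four experts.
logits = [2.0, 0.5, 1.0, -1.0]
gates = top_k_gate(logits, k=2)
# Experts with a non-zero gate are the only ones that need to be evaluated.
active = [i for i, g in enumerate(gates) if g > 0.0]
```

With these example logits, only experts 0 and 2 receive a non-zero gate, so the other two sub-networks can be skipped entirely for this input.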

Updated 2022-06-25

Tags

Data Science
