Learn Before
Concept
Weight typing
Weight-tying is where you have a language model and use the same weight matrix for the input-to-embedding layer (the input embedding) and the hidden-to-softmax layer (the output embedding). The idea is that these two matrices contain essentially the same information, each having a row per word in the vocabulary.
In addition to providing improved perplexity results, this approach significantly reduces the number of parameters required for the model.
0
1
Updated 2021-11-21
Tags
Data Science