Learn Before
Concept

Weight typing

Weight-tying is where you have a language model and use the same weight matrix for the input-to-embedding layer (the input embedding) and the hidden-to-softmax layer (the output embedding). The idea is that these two matrices contain essentially the same information, each having a row per word in the vocabulary.

In addition to providing improved perplexity results, this approach significantly reduces the number of parameters required for the model.

0

1

Updated 2021-11-21

Tags

Data Science