Concept
Additive Attention Function
It differs slightly from the multiplicative concat score, but it also uses a multilayer perceptron. It takes two inputs:
- the encoder vector
- the previous decoder vector
It is very similar to the concat function, but it does not concatenate the input states up front. Instead, the encoder state and the decoder state are each passed through their own Dense layer, and only then are the two results combined. In the standard additive formulation they are summed and passed through tanh, which is where the name comes from. A single Dense layer then maps the combined vector to a scalar score. Additive attention also uses the previous decoder state, rather than the current one used in multiplicative attention.
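The description above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: it assumes the standard additive form, score = v . tanh(W_enc * h + W_dec * s_prev), and the names `matvec`, `additive_score`, `W_enc`, `W_dec` and `v` are hypothetical, chosen only for this example. Biases are omitted and the weights are random instead of learned.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def additive_score(enc_state, prev_dec_state, W_enc, W_dec, v):
    """Score one encoder state against the previous decoder state.

    Assumed form: v . tanh(W_enc @ enc_state + W_dec @ prev_dec_state)
    """
    enc_proj = matvec(W_enc, enc_state)       # Dense layer applied to the encoder state
    dec_proj = matvec(W_dec, prev_dec_state)  # separate Dense layer for the decoder state
    # Combine by addition (no concatenation), then apply tanh
    hidden = [math.tanh(e + d) for e, d in zip(enc_proj, dec_proj)]
    # Final single Dense layer (a vector v) reduces the result to a scalar score
    return sum(vi * hi for vi, hi in zip(v, hidden))


def rand_matrix(rows, cols):
    """Random weights standing in for learned parameters (illustration only)."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes, picked arbitrarily for the example
random.seed(0)
enc_dim, dec_dim, attn_dim = 4, 3, 5

W_enc = rand_matrix(attn_dim, enc_dim)
W_dec = rand_matrix(attn_dim, dec_dim)
v = [random.uniform(-1.0, 1.0) for _ in range(attn_dim)]

enc_states = rand_matrix(6, enc_dim)  # six encoder states, one per source position
prev_dec_state = [random.uniform(-1.0, 1.0) for _ in range(dec_dim)]

# One score per encoder position, all computed against the same previous decoder state
scores = [additive_score(h, prev_dec_state, W_enc, W_dec, v) for h in enc_states]
print(scores)
```

Because the encoder and decoder states get separate weight matrices, they may have different sizes. Only the two projections need to share the same dimension (`attn_dim` here).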
Updated 2020-10-10
Tags
Data Science