Learn Before
Concept
General Attention function
This is similar to dot-product attention, but it adds a learned component: the encoder states are first passed through a Dense layer before the dot product with the decoder state is taken. The attention mechanism itself is therefore subject to backpropagation and gradient descent.
- encoder state
- decoder state
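A minimal NumPy sketch of the idea, assuming Luong-style "general" scoring where the learned Dense layer is represented by a weight matrix `W` (the function name and shapes are illustrative, not from the original):

```python
import numpy as np

def general_attention(decoder_state, encoder_states, W):
    """General attention: score_i = s^T W h_i, where W is a learned matrix."""
    # Pass encoder states through the "Dense layer" (matrix W),
    # then take the dot product with the decoder state.
    scores = encoder_states @ W @ decoder_state        # shape: (num_steps,)
    # Softmax over encoder time steps to get attention weights.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of encoder states.
    context = weights @ encoder_states                 # shape: (hidden_size,)
    return weights, context

rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 4))   # 5 encoder steps, hidden size 4
decoder_state = rng.normal(size=4)
W = rng.normal(size=(4, 4))                # learned via backprop in practice
weights, context = general_attention(decoder_state, encoder_states, W)
```

Because `W` is a trainable parameter, gradients flow through the scores during training, which is what distinguishes this from plain dot-product attention (the special case `W = I`).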
Updated 2020-10-10
Tags
Data Science