What problem is this paper trying to solve?
- RNNs, LSTMs, and gated RNNs are the most widely used approaches for sequence modeling tasks such as machine translation and language modeling.
- An RNN handles a sequence word by word, and this inherent sequentiality is an obstacle to parallelizing the computation. Moreover, when sequences are very long, the model is prone to forgetting the content of distant positions in the sequence or mixing it up with the content of later positions. (CNN-based sequence models parallelize better, but they still need many stacked layers to relate distant positions.)
- Attention mechanisms are one way to overcome this forgetting problem, because they model dependencies between positions directly, regardless of their distance in the input or output sequences.
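The last point can be made concrete with a minimal sketch of scaled dot-product attention in NumPy (an illustrative implementation, not the paper's code; function and variable names here are my own). Note that every query position attends to every key position in a single matrix multiply, so the path between any two positions is one step, however far apart they are:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Every query scores against every key at once, so modeling a
    # dependency between positions i and j does not depend on |i - j|.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # shape: (len_q, len_k)
    weights = softmax(scores, axis=-1)   # rows sum to 1
    return weights @ V, weights

# Self-attention over a toy sequence of 5 positions, model dim 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
out, weights = scaled_dot_product_attention(x, x, x)
```

Because the whole sequence is processed in one batched operation, this also removes the sequential bottleneck of RNNs mentioned above.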
Updated 2021-08-18
Tags
Data Science