Learn Before
Relation
Transformer model
- The Transformer model proposed in this paper is an architecture that relies entirely on the attention mechanism to draw global dependencies between input and output, dispensing with recurrence and convolutions.
- It allows significantly more parallelization and reaches state-of-the-art translation quality after training for as little as 12 hours on 8 P100 GPUs.
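The core operation behind this architecture is scaled dot-product attention, in which every position attends to every other position in a single step, which is what enables the global dependencies and the parallelism mentioned above. A minimal NumPy sketch (the function name and toy shapes are illustrative, not from the note):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of value vectors

# toy self-attention: 3 positions, model dimension 4
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(X, X, X)  # every position attends to all positions
print(out.shape)  # (3, 4)
```

Because each of the three output rows is computed from all input rows at once, the whole operation is a pair of matrix multiplications, which parallelizes well on GPUs, unlike the sequential steps of an RNN.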
Updated 2021-08-18
Tags
Data Science