Transformer models using Recurrence
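A natural extension of the blockwise method, which restricts attention to fixed-size blocks of the sequence, is to connect these blocks via recurrence: hidden states computed for one block are cached and reused as additional context when the next block is processed.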
- Transformer-XL (Dai et al., 2019), which proposes a segment-level recurrence mechanism that connects successive segments
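The idea of connecting blocks through cached states can be illustrated with a minimal NumPy sketch. It is a simplified, single-layer, single-head illustration in the spirit of segment-level recurrence, not the Transformer-XL implementation: it omits relative positional encodings, multiple heads, stacked layers, and the stop-gradient applied to the cache during training. The function names (`attend_with_memory`, `run_recurrent`) and the parameters `seg_len` and `mem_len` are hypothetical and chosen only for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax; masked (-inf) scores become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend_with_memory(segment, memory, w_q, w_k, w_v):
    """Causal single-head attention over one segment plus cached memory.

    segment: (s, d) inputs of the current block
    memory:  (m, d) cached inputs of earlier blocks (m may be 0)
    """
    s, m = segment.shape[0], memory.shape[0]
    context = np.concatenate([memory, segment], axis=0)  # (m + s, d)

    q = segment @ w_q   # queries come from the current segment only
    k = context @ w_k   # keys and values also cover the cached memory
    v = context @ w_v

    scores = q @ k.T / np.sqrt(q.shape[-1])  # (s, m + s)

    # Position i may see every memory slot and segment positions <= i.
    allowed = np.concatenate(
        [np.ones((s, m), dtype=bool), np.tril(np.ones((s, s), dtype=bool))],
        axis=1,
    )
    scores = np.where(allowed, scores, -np.inf)
    return softmax(scores, axis=-1) @ v


def run_recurrent(x, seg_len, mem_len, w_q, w_k, w_v):
    """Process x block by block, carrying a cache of at most mem_len states."""
    d = x.shape[1]
    memory = np.zeros((0, d))
    outputs = []
    for start in range(0, x.shape[0], seg_len):
        segment = x[start:start + seg_len]
        outputs.append(attend_with_memory(segment, memory, w_q, w_k, w_v))
        # Cache the layer inputs of this block for the next one,
        # keeping only the most recent mem_len positions.
        cache = np.concatenate([memory, segment], axis=0)
        keep = min(mem_len, cache.shape[0])
        memory = cache[cache.shape[0] - keep:]
    return np.concatenate(outputs, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    x = rng.normal(size=(16, d))
    w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
    out = run_recurrent(x, seg_len=4, mem_len=4, w_q=w_q, w_k=w_k, w_v=w_v)
    print(out.shape)
```

With `mem_len=0` the sketch reduces to purely blockwise attention, where each block sees only itself. With a non-zero `mem_len`, each block also attends to cached states from earlier blocks, so the attention cost per block stays bounded by `seg_len * (seg_len + mem_len)` while information can still flow across block boundaries.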
Updated 2022-10-30
Tags
Data Science
Related
Transformer models using Fixed Patterns
Transformer models using Combination of Patterns (CP)
Transformer models using Learnable Patterns
Transformer models using Neural Memory
Transformer models using Low-Rank Methods
Transformer models using Kernels
Transformer models using Recurrence
Transformer models using Downsampling
Transformer models using Sparse Models and Conditional Computation