Concept
Adapting the FFN for Larger Capacity in Transformers
The basic idea is to replace the feed-forward networks (FFNs) in a Transformer with similarly structured modules that have many more parameters, which gives the model a larger capacity.
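To make the parameter growth concrete, here is a minimal sketch that counts the parameters of a standard two-layer FFN and of one possible larger replacement. The replacement shown (several parallel FFN copies plus a linear router, in the style of a mixture of experts) is an assumption chosen for illustration; the card itself does not name a specific structure. The function names and the sizes `d_model=512`, `d_ff=2048`, `num_experts=8` are likewise hypothetical.

```python
def ffn_param_count(d_model: int, d_ff: int) -> int:
    """Parameters of a standard two-layer FFN (weights and biases)."""
    first_layer = d_model * d_ff + d_ff      # W1 and b1
    second_layer = d_ff * d_model + d_model  # W2 and b2
    return first_layer + second_layer


def expert_ffn_param_count(d_model: int, d_ff: int, num_experts: int) -> int:
    """Parameters of a hypothetical replacement: num_experts parallel FFNs
    plus a bias-free linear router of shape (d_model, num_experts).
    This structure is an illustrative assumption, not taken from the card."""
    experts = num_experts * ffn_param_count(d_model, d_ff)
    router = d_model * num_experts
    return experts + router


# Illustrative sizes (assumed, not from the source).
base = ffn_param_count(d_model=512, d_ff=2048)
larger = expert_ffn_param_count(d_model=512, d_ff=2048, num_experts=8)

print(f"standard FFN parameters:    {base:,}")
print(f"replacement parameters:     {larger:,}")
print(f"growth factor:              {larger / base:.2f}x")
```

In this sketch the replacement keeps the same input and output width as the FFN it stands in for, so it can be dropped into the same position in a Transformer layer while holding roughly `num_experts` times as many parameters.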
Updated 2022-05-26
Tags
Data Science