Concept

Adapting the FFN for Larger Capacity in Transformers

The basic idea is to replace the feed-forward networks (FFNs) in a transformer with similarly structured modules that have many more parameters, which gives the model a larger capacity.
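To give a rough sense of how much capacity such a replacement can add, the sketch below counts the parameters of a standard two-layer FFN and of one possible enlarged variant. The variant is an assumption chosen for illustration, not something this concept specifies: several parallel FFN copies ("experts") plus a linear router, as in mixture-of-experts layers. The function names and the sizes `d_model`, `d_ff`, and `num_experts` are hypothetical.

```python
def ffn_params(d_model: int, d_ff: int) -> int:
    """Parameter count of a standard two-layer FFN (W1, b1, W2, b2)."""
    return d_model * d_ff + d_ff + d_ff * d_model + d_model


def expert_ffn_params(d_model: int, d_ff: int, num_experts: int) -> int:
    """Parameter count when the FFN is replaced by `num_experts` FFN copies.

    Assumes a bias-free linear router that maps each token to one score
    per expert; this is an illustrative choice, not the only design.
    """
    router = d_model * num_experts
    return num_experts * ffn_params(d_model, d_ff) + router


# Illustrative sizes only (assumed, not taken from the text above).
d_model, d_ff, num_experts = 512, 2048, 8

dense = ffn_params(d_model, d_ff)
enlarged = expert_ffn_params(d_model, d_ff, num_experts)

print(f"standard FFN parameters: {dense:,}")
print(f"enlarged FFN parameters: {enlarged:,}")
print(f"growth factor:           {enlarged / dense:.2f}x")
```

Under these assumptions the parameter count grows roughly in proportion to the number of experts, while each expert keeps the same two-layer shape as the original FFN.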

Updated 2022-05-26

Tags

Data Science