Learn Before
Concept
Size of models
Background: The quantity of data in a Many-to-Many dataset grows quadratically with the number of languages, causing standard-capacity neural networks to underfit rapidly. Key contribution: "we leverage progress in scaling (Kaplan et al., 2020; Arora et al., 2018) to train models that are over 50 times larger than current bilingual models with model parallelism"
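The quadratic growth is easy to see: a Many-to-Many dataset covers every ordered pair of distinct languages, so the number of translation directions is n(n-1). A minimal sketch (the function name is illustrative, not from the source):

```python
# Number of translation directions in a Many-to-Many dataset:
# every ordered pair of distinct languages is one direction,
# so the count grows quadratically with the number of languages.
def num_directions(n_languages: int) -> int:
    return n_languages * (n_languages - 1)

for n in (10, 50, 100):
    print(f"{n} languages -> {num_directions(n)} directions")
```

For 100 languages this yields 9,900 directions, versus a single direction for a bilingual model, which is why far larger model capacity is needed.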
Updated 2022-06-09
Tags
Science
Data Science