Learn Before
Concept
mBART Model Configuration
The authors fine-tuned mBART, which is pretrained on monolingual data in many languages, using source-side word masking and replacement (Baziotis et al., 2021) and target-side word-replacement noise (Voita et al., 2021). For noisy fine-tuning, they trained three variants: “mBART + mask” (masking 10% of source tokens), “mBART + replace (enc)” (replacing 10% of source tokens with random tokens), and “mBART + replace (dec)” (replacing 10% of target tokens with random tokens).
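The three noising schemes can be sketched as simple operations on token-id sequences. This is an illustrative sketch only: `MASK_ID` and `VOCAB_SIZE` are placeholder values, not mBART's actual tokenizer constants, and the real fine-tuning applies this noise inside the training data pipeline.

```python
import random

MASK_ID = 250001     # placeholder mask-token id (the real id depends on the tokenizer)
VOCAB_SIZE = 250002  # placeholder vocabulary size

def mask_tokens(token_ids, p=0.10, mask_id=MASK_ID):
    """'mBART + mask': replace a fraction p of source tokens with the mask token."""
    return [mask_id if random.random() < p else t for t in token_ids]

def replace_tokens(token_ids, p=0.10, vocab_size=VOCAB_SIZE):
    """'mBART + replace': swap a fraction p of tokens for random vocabulary ids.
    Applied to the source sequence ('enc') or the target sequence ('dec')."""
    return [random.randrange(vocab_size) if random.random() < p else t
            for t in token_ids]
```

Each position is noised independently with probability p = 0.10, so on average 10% of tokens are masked or replaced while the sequence length is unchanged.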
Updated 2023-02-17
Tags
Data Science