Learn Before
Interference in Multilingual Models
A key drawback of multilingual pre-training is interference: the languages in the training mixture compete for the model's shared capacity, so gains on some languages can come at the expense of others. The effect can become more pronounced when the model is trained for an excessively long duration, degrading its performance on certain languages or tasks, often the low-resource ones.
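One practical way to observe interference is to track validation loss per language rather than only the aggregate loss: interference shows up when a language's loss starts rising even as the overall loss keeps falling. The sketch below is a hypothetical illustration with made-up loss curves (the function name `detect_interference` and all numbers are assumptions, not part of any real training framework).

```python
# Hypothetical sketch: flagging interference during multilingual pre-training
# by comparing per-language validation loss against the aggregate trend.
# All loss values below are illustrative, not real measurements.

def detect_interference(history, patience=2):
    """Return languages whose validation loss rose for `patience`
    consecutive checkpoints while the mean loss across languages
    was still falling at the latest checkpoint."""
    steps = len(next(iter(history.values())))
    # mean validation loss across all languages at each checkpoint
    mean = [sum(h[t] for h in history.values()) / len(history)
            for t in range(steps)]
    flagged = []
    for lang, losses in history.items():
        if len(losses) <= patience:
            continue
        # this language's loss increased at each of the last `patience` steps
        rising = all(losses[-i] > losses[-i - 1]
                     for i in range(1, patience + 1))
        if rising and mean[-1] < mean[-2]:
            flagged.append(lang)
    return flagged

# Illustrative curves: English and Spanish keep improving,
# while the low-resource language (Swahili) regresses late in training.
history = {
    "en": [2.0, 1.6, 1.3, 1.1, 0.9],
    "es": [2.1, 1.7, 1.4, 1.2, 1.1],
    "sw": [3.0, 2.6, 2.5, 2.7, 2.9],
}
print(detect_interference(history))  # → ['sw']
```

This mirrors the scenario in the questions below: aggregate loss still decreases (dominated by high-resource languages) while a low-resource language degrades, which per-language monitoring or early stopping can catch.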
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Related
Scaling Considerations for Multilingual Models
Evaluating a Multilingual Pre-training Strategy
A research team is pre-training a multilingual language model on a dataset containing text from 50 languages. After training, they observe that the model's performance on Swahili, a language with relatively little data in the training set, is significantly worse than its performance on high-resource languages like English and Spanish. Assuming the model architecture is sound, which of the following configuration choices is the most likely cause of this performance disparity?
A team of researchers is developing a multilingual language model and encounters several performance issues. Match each observed issue with the most likely underlying configuration factor that needs adjustment, assuming the model's architecture is fixed.
Learn After
A research team is training a large multilingual language model on a dataset containing English, Spanish, and Swahili. They observe that after an extensive number of training steps, the model's performance on a Swahili translation task begins to degrade, even though its performance on English remains strong and the overall training loss continues to decrease. Which of the following concepts best explains this specific outcome?
Diagnosing Performance Degradation in a Multilingual Model
Addressing Performance Imbalance in a Multi-Language Model
Early Stopping in Multilingual Pre-training