Learn Before
Factors Influencing Multilingual Pre-training
For a fixed architecture, the effectiveness of a multilingual pre-trained model depends on several key configuration choices: the size of the shared vocabulary, the proportion of training data allocated to each language, and the overall size of the model (its parameter count).
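To make the data-proportion factor concrete, here is a minimal sketch of one common way to set per-language sampling proportions: exponent-smoothed sampling, where each language is drawn with probability proportional to its corpus share raised to a power alpha. This technique is not described in the text above; it is used here only as an illustration. The function name, the alpha value, and the corpus sizes are all hypothetical.

```python
def sampling_probabilities(token_counts, alpha=0.5):
    """Return per-language sampling probabilities p_i proportional to (n_i / N) ** alpha.

    alpha = 1.0 samples in proportion to raw corpus size;
    alpha < 1.0 shifts probability mass toward low-resource languages;
    alpha = 0.0 samples all languages uniformly.
    """
    total = sum(token_counts.values())
    weights = {lang: (n / total) ** alpha for lang, n in token_counts.items()}
    norm = sum(weights.values())
    return {lang: w / norm for lang, w in weights.items()}


# Hypothetical corpus sizes (in tokens), chosen only for illustration.
corpus = {"en": 1_000_000, "es": 250_000, "sw": 10_000}

raw = sampling_probabilities(corpus, alpha=1.0)       # proportional to data size
smoothed = sampling_probabilities(corpus, alpha=0.5)  # boosts low-resource languages

for lang in corpus:
    print(f"{lang}: raw={raw[lang]:.4f} smoothed={smoothed[lang]:.4f}")
```

Under these assumptions, lowering alpha raises the share of training batches drawn from the smallest corpus and lowers the share drawn from the largest one, which is the kind of trade-off the data-allocation choice controls.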
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Cross-Lingual Learning
Bilingual Pre-training for Multilingual Models
Benefit of Multilingual Pre-trained Models: Handling Code-Switching
Shared Vocabulary in Multilingual Models
Factors Influencing Multilingual Pre-training
A company is developing a sentiment analysis tool. Their primary market is in France, for which they have a massive, high-quality dataset. They also need to provide functional support for Spanish and German, but have very limited data for these languages. The highest priority is achieving state-of-the-art performance for the French market, while still being able to handle the other languages. Given these requirements, which strategy for choosing a foundational model is most appropriate?
Model Selection for a Monolingual Task
Match each pre-trained model with the description that best characterizes its training methodology and primary use case.
Learn After
Scaling Considerations for Multilingual Models
Interference in Multilingual Models
Evaluating a Multilingual Pre-training Strategy
A research team is pre-training a multilingual language model on a dataset containing text from 50 languages. After training, they observe that the model's performance on Swahili, a language with relatively little data in the training set, is significantly worse than its performance on high-resource languages like English and Spanish. Assuming the model architecture is sound, which of the following configuration choices is the most likely cause of this performance disparity?
A team of researchers is developing a multilingual language model and encounters several performance issues. Match each observed issue with the most likely underlying configuration factor that needs adjustment, assuming the model's architecture is fixed.