Learn Before
A research team is pre-training a multilingual language model on a dataset containing text from 50 languages. After training, they observe that the model's performance on Swahili, a language with relatively little data in the training set, is significantly worse than its performance on high-resource languages like English and Spanish. Assuming the model architecture is sound, which of the following configuration choices is the most likely cause of this performance disparity?
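The question turns on how the training data is mixed across languages. A common culprit is sampling each language in proportion to its raw corpus size, which leaves a low-resource language like Swahili almost unseen during training; temperature-based (exponentiated) sampling, used in multilingual models such as mBERT and XLM-R, flattens the distribution. The sketch below is illustrative only: the token counts are made up, and sampling_probs is a hypothetical helper, not code from the course.

```python
import numpy as np

# Hypothetical per-language token counts (illustrative numbers only).
token_counts = {"en": 1_000_000_000, "es": 400_000_000, "sw": 5_000_000}

def sampling_probs(counts, alpha=0.3):
    """Temperature-based sampling: p_i is proportional to q_i**alpha,
    where q_i is a language's share of the corpus. alpha < 1 flattens
    the distribution, upsampling low-resource languages."""
    langs = list(counts)
    q = np.array([counts[l] for l in langs], dtype=float)
    q /= q.sum()        # raw corpus proportions
    p = q ** alpha      # exponentiate to smooth the distribution
    p /= p.sum()        # renormalize to a probability distribution
    return dict(zip(langs, p))

print(sampling_probs(token_counts, alpha=1.0))  # proportional: Swahili ~0.4%
print(sampling_probs(token_counts, alpha=0.3))  # smoothed: Swahili ~10%
```

With alpha = 1 the mixture mirrors the raw corpus and Swahili receives well under 1% of training examples; lowering alpha toward 0 moves the mixture toward uniform, trading some high-resource exposure for better low-resource coverage.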
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Scaling Considerations for Multilingual Models
Interference in Multilingual Models
Evaluating a Multilingual Pre-training Strategy
A team of researchers is developing a multilingual language model and encounters several performance issues. Match each observed issue with the most likely underlying configuration factor that needs adjustment, assuming the model's architecture is fixed.
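One issue-to-factor pairing this kind of exercise often targets is over-fragmentation by a shared subword vocabulary: if the vocabulary was learned mostly from high-resource text, sentences in low-resource languages split into many more pieces per word, hurting performance even when some training data is present. A quick way to probe this is to compare subword "fertility" (tokens per word) across languages. A minimal sketch, assuming the Hugging Face transformers library and its public xlm-roberta-base checkpoint; the sample sentences are illustrative only.

```python
from transformers import AutoTokenizer

# Illustrative parallel sentences (the Swahili line is a rough translation).
samples = {
    "en": "The committee approved the new budget yesterday.",
    "sw": "Kamati iliidhinisha bajeti mpya jana.",
}

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")

for lang, text in samples.items():
    words = text.split()
    pieces = tok.tokenize(text)
    # Fertility = subword tokens per whitespace word; higher values mean
    # the shared vocabulary fragments this language more heavily.
    print(f"{lang}: {len(pieces) / len(words):.2f} subwords per word")
```

A markedly higher fertility for one language suggests the tokenizer's training data mix, rather than the model architecture, is the configuration factor that needs adjustment.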