Concept

Bilingual Pre-training for Multilingual Models

A significant improvement to multilingual pre-trained models such as mBERT is to incorporate bilingual (parallel) data into pre-training. Rather than training only on separate monolingual corpora, this approach explicitly models the correspondences between tokens in two different languages. This gives the model built-in cross-lingual transfer ability, making it more readily adaptable to new languages.

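One common instance of this idea is the translation language modeling (TLM) objective from XLM, where a parallel sentence pair is concatenated and tokens are masked in both languages, so the model must attend across languages to recover them. The sketch below is a minimal, illustrative construction of such a training example; the function name, separator token, and mask probability are assumptions, not details from this card.

```python
import random

MASK, SEP = "[MASK]", "[/s]"

def make_tlm_example(src_tokens, tgt_tokens, mask_prob=0.15, seed=0):
    """Build a TLM-style training example (illustrative sketch).

    A parallel sentence pair is concatenated and tokens are masked
    in BOTH languages, so the model can use the other language's
    context to predict each masked token.
    """
    rng = random.Random(seed)
    tokens = src_tokens + [SEP] + tgt_tokens
    inputs, labels = [], []
    for tok in tokens:
        if tok != SEP and rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)   # original token: target of the MLM loss
        else:
            inputs.append(tok)
            labels.append(None)  # unmasked positions carry no loss
    return inputs, labels

# Example: an English-French parallel pair.
en = "the cat sat on the mat".split()
fr = "le chat était assis sur le tapis".split()
inp, lab = make_tlm_example(en, fr, mask_prob=0.3)
print(inp)
print(lab)
```

Because a masked English word may be unmasked on the French side (and vice versa), the model is pushed to align representations across the two languages, which is the source of the cross-lingual transfer described above.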