Input Embedding in Cross-Lingual Language Models
In the work of Lample and Conneau on cross-lingual language models (XLM), the input embedding for each token is computed as the sum of three components: its token embedding, its positional embedding, and a language embedding. Including a language embedding requires assigning a language label to every token, which enables the model to distinguish tokens from different languages.
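A minimal PyTorch sketch of this summation is shown below. It is an illustration of the idea rather than the authors' original implementation: the class name, the embedding dimensions, and the language-id convention (0 for English, 1 for French in the usage example) are assumptions made here for clarity.

```python
import torch
import torch.nn as nn

class XLMInputEmbedding(nn.Module):
    """Sketch of an XLM-style input layer: token + position + language embeddings."""

    def __init__(self, vocab_size=64000, max_len=512, num_languages=2, d_model=1024):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)    # one vector per token id
        self.pos_emb = nn.Embedding(max_len, d_model)         # learned positional embedding
        self.lang_emb = nn.Embedding(num_languages, d_model)  # one vector per language id

    def forward(self, token_ids, lang_ids):
        # token_ids, lang_ids: (batch, seq_len); lang_ids labels each token's language
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        positions = positions.unsqueeze(0).expand_as(token_ids)
        # The input embedding is the elementwise sum of the three embeddings.
        return self.token_emb(token_ids) + self.pos_emb(positions) + self.lang_emb(lang_ids)

# Usage: a batch with one English sentence (language id 0) and one French sentence (id 1).
emb = XLMInputEmbedding()
token_ids = torch.tensor([[5, 42, 7], [9, 13, 2]])
lang_ids = torch.tensor([[0, 0, 0], [1, 1, 1]])
print(emb(token_ids, lang_ids).shape)  # torch.Size([2, 3, 1024])
```

Because the language embedding is simply added to the other two, every token carries an explicit signal of which language it belongs to, which is what lets the per-token language labels influence the model's representations.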
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Bilingual Sentence Packing for Pre-training
Pre-training Strategy for a Multilingual Model
A researcher is pre-training a multilingual model using a masked language modeling (MLM) objective. To align the pre-training process with the specific methodology of Cross-Lingual Language Models (XLMs), what is the most crucial characteristic of the input data?
Core Training Principle of XLM
Translation Language Modeling