Bilingual Sentence Packing for Pre-training
A specific technique used in bilingual pre-training is to sample an aligned sentence pair, that is, a sentence in one language together with its translation in the other. The two sentences are concatenated into a single combined sequence that serves as one training input. This directly exposes the model to parallel data within a single training instance, facilitating cross-lingual learning.
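Below is a minimal sketch of how such a packed bilingual input might be constructed, assuming a caller-supplied subword tokenizer and XLM-style "</s>" boundary tokens; the function names, separator token, and language-id convention are illustrative assumptions, not a specific library's API.

```python
# Minimal sketch of bilingual sentence packing (illustrative, not a
# specific library's implementation). Assumes a caller-supplied
# tokenizer and "</s>" sentence-boundary tokens.
import random
from typing import Callable, List, Tuple

def pack_bilingual_pair(
    src_sentence: str,
    tgt_sentence: str,
    tokenize: Callable[[str], List[str]],
    boundary: str = "</s>",   # assumed separator token
) -> Tuple[List[str], List[int]]:
    """Concatenate an aligned sentence pair into one training sequence.

    Returns the packed token list plus a parallel list of language ids
    (0 = source language, 1 = target language) that the model could use
    for language embeddings.
    """
    src_tokens = [boundary] + tokenize(src_sentence) + [boundary]
    tgt_tokens = tokenize(tgt_sentence) + [boundary]
    packed = src_tokens + tgt_tokens
    lang_ids = [0] * len(src_tokens) + [1] * len(tgt_tokens)
    return packed, lang_ids

def sample_packed_inputs(
    parallel_corpus: List[Tuple[str, str]],
    batch_size: int,
    tokenize: Callable[[str], List[str]],
) -> List[Tuple[List[str], List[int]]]:
    """Sample aligned pairs from a parallel corpus; each pair becomes one packed input."""
    pairs = random.sample(parallel_corpus, batch_size)
    return [pack_bilingual_pair(src, tgt, tokenize) for src, tgt in pairs]

# Example with a whitespace tokenizer (purely illustrative):
if __name__ == "__main__":
    corpus = [("the cat sleeps", "die Katze schläft")]
    tokens, lang_ids = pack_bilingual_pair(*corpus[0], tokenize=str.split)
    print(tokens)    # ['</s>', 'the', 'cat', 'sleeps', '</s>', 'die', 'Katze', 'schläft', '</s>']
    print(lang_ids)  # [0, 0, 0, 0, 0, 1, 1, 1, 1]
```

In practice, the packed tokens would be mapped to vocabulary ids and combined with positional and language embeddings before masking; the key point is that both languages appear in the same training instance.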
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Cross-Lingual Language Models (XLM)
Performance Degradation due to Interference in Bilingual Pre-training
An NLP team is developing a model for a Spanish-to-Portuguese translation service. They are considering two different pre-training strategies before fine-tuning the model on a specific translation dataset.
Strategy 1: The model is trained on a large corpus containing millions of Spanish documents and a separate, equally large corpus of Portuguese documents. During each training step, the model processes text from only one of the two languages.
Strategy 2: The model is trained on a large corpus of Spanish sentences that have been professionally translated into Portuguese. During each training step, the model processes a Spanish sentence and its corresponding Portuguese translation together.
Which statement best analyzes the likely effectiveness of these two strategies for the final translation task?
Analyzing Pre-training Strategies for Multilingual Models
Pre-training Strategy for Zero-Shot Cross-Lingual Transfer
Pre-training Strategy for a Multilingual Model
A researcher is pre-training a multilingual model using a masked language modeling (MLM) objective. To align the pre-training process with the specific methodology of Cross-Lingual Language Models (XLM), what is the most crucial characteristic of the input data?
Core Training Principle of XLM
Translation Language Modeling
Input Embedding in Cross-Lingual Language Models
Learn After
A pre-training strategy for a multilingual model involves taking an aligned sentence pair (e.g., an English sentence and its German translation) and concatenating them to form a single input sequence for one training step. What is the primary advantage of this method compared to training the model on the English and German sentences in separate, independent training steps?
Example of an Aligned Bilingual Sentence Pair
Constructing a Packed Bilingual Input
A researcher is pre-training a cross-lingual language model using a technique that combines sentences from two different languages into a single training input. Arrange the following steps to accurately describe this process.