Learn Before
Analyzing Pre-training Strategies for Multilingual Models
A language model can be pre-trained for multilingual tasks in two ways: (A) on a large collection of documents where each document is in a single language (e.g., English or German), or (B) on a collection of sentence pairs, where each pair consists of a sentence and its direct translation (e.g., an English sentence and its German translation). Analyze why approach (B) is generally more effective for developing a model with strong cross-lingual transfer abilities.
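To make the contrast concrete, here is a minimal, hypothetical Python sketch of how a single training sequence might be built under each approach. The toy whitespace tokenizer and special tokens are illustrative assumptions, not any specific model's preprocessing.

```python
# A minimal sketch (not any particular library's API) contrasting how one
# pre-training sequence can be built under approach (A) and approach (B).

def tokenize(text):
    """Toy whitespace tokenizer standing in for a real subword tokenizer."""
    return text.split()

def build_monolingual_sequence(document):
    # Approach (A): each training sequence comes from a single-language
    # document, so the model never sees both languages in one context window.
    return ["<s>"] + tokenize(document) + ["</s>"]

def build_translation_pair_sequence(src_sentence, tgt_sentence):
    # Approach (B): the sentence and its translation are concatenated into one
    # sequence, so self-attention can relate tokens across the two languages.
    return (["<s>"] + tokenize(src_sentence) + ["</s>"]
            + ["<s>"] + tokenize(tgt_sentence) + ["</s>"])

if __name__ == "__main__":
    print(build_monolingual_sequence("The weather is nice today"))
    print(build_translation_pair_sequence(
        "The weather is nice today",    # English sentence
        "Das Wetter ist heute schön",   # its German translation
    ))
```

Under (B), attention over the concatenated pair can align "weather" with "Wetter" within a single sequence, which is exactly the cross-lingual signal that approach (A) never provides in any one training example.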
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Cross-Lingual Language Models (XLM)
Bilingual Sentence Packing for Pre-training
Performance Degradation due to Interference in Bilingual Pre-training
An NLP team is developing a model for a Spanish-to-Portuguese translation service. They are considering two different pre-training strategies before fine-tuning the model on a specific translation dataset.
Strategy 1: The model is trained on a large corpus containing millions of Spanish documents and a separate, equally large corpus of Portuguese documents. During each training step, the model processes text from only one of the two languages.
Strategy 2: The model is trained on a large corpus of Spanish sentences that have been professionally translated into Portuguese. During each training step, the model processes a Spanish sentence and its corresponding Portuguese translation together.
Which statement best analyzes the likely effectiveness of these two strategies for the final translation task?
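To ground the comparison, the following toy Python sketch (names and the one-position masking scheme are illustrative assumptions, not any framework's API) shows why packing a sentence pair into one sequence, as in Strategy 2, lets a masked word in one language be recovered from its counterpart in the other.

```python
# A toy, framework-free sketch of why Strategy 2 encourages cross-lingual
# alignment: when the Spanish sentence and its Portuguese translation share
# one masked-language-modeling sequence, a masked Portuguese token can be
# predicted from the unmasked Spanish context.

def mask_position(tokens, position, mask_token="[MASK]"):
    """Replace one token with [MASK]; the model is trained to recover it."""
    masked = list(tokens)
    target = masked[position]
    masked[position] = mask_token
    return masked, target

spanish = "el gato duerme en la cocina".split()
portuguese = "o gato dorme na cozinha".split()

# Strategy 1: the training step sees only Portuguese, so recovering the
# masked word relies on Portuguese context alone.
mono_input, mono_target = mask_position(portuguese, 2)  # masks "dorme"

# Strategy 2: the pair is packed into one sequence, so the masked Portuguese
# word can also be recovered from its visible Spanish counterpart ("duerme"),
# which pushes the two languages toward shared representations.
pair_input, pair_target = mask_position(spanish + ["</s>"] + portuguese, 9)

print("Strategy 1 input:", mono_input, "| target:", mono_target)
print("Strategy 2 input:", pair_input, "| target:", pair_target)
```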
Pre-training Strategy for Zero-Shot Cross-Lingual Transfer