Learn Before
Performance Degradation due to Interference in Bilingual Pre-training
During the pre-training of a bilingual model, a phenomenon known as interference can occur: because the two languages compete for the model's shared capacity, the model's overall performance may, after a certain amount of training, begin to decline rather than continue improving.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Related
Cross-Lingual Language Models (XLM)
Bilingual Sentence Packing for Pre-training
Performance Degradation due to Interference in Bilingual Pre-training
An NLP team is developing a model for a Spanish-to-Portuguese translation service. They are considering two different pre-training strategies before fine-tuning the model on a specific translation dataset.
Strategy 1: The model is trained on a large corpus containing millions of Spanish documents and a separate, equally large corpus of Portuguese documents. During each training step, the model processes text from only one of the two languages.
Strategy 2: The model is trained on a large corpus of Spanish sentences that have been professionally translated into Portuguese. During each training step, the model processes a Spanish sentence and its corresponding Portuguese translation together.
Which statement best analyzes the likely effectiveness of these two strategies for the final translation task?
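The key mechanical difference between the two strategies is what the model sees in a single training step. Strategy 2 resembles translation language modeling as used in XLM, where a sentence and its translation are packed into one sequence so the model can attend across both languages. The sketch below illustrates that packing; the token names (`<s>`, `</s>`) and the double-separator convention are illustrative assumptions, not details given in the scenario.

```python
def pack_bilingual_pair(src_tokens, tgt_tokens, bos="<s>", sep="</s>"):
    """Pack a parallel sentence pair into one training sequence
    (Strategy 2), so a single step exposes the model to both
    languages and their alignment. Special tokens are assumed."""
    return [bos] + src_tokens + [sep, sep] + tgt_tokens + [sep]

# Spanish sentence and its Portuguese translation, processed together:
packed = pack_bilingual_pair(
    ["El", "gato", "duerme"],   # Spanish
    ["O", "gato", "dorme"],     # Portuguese
)
```

Under Strategy 1, by contrast, each step would contain tokens from only one of the two corpora, so the model never sees an explicit alignment signal between Spanish and Portuguese during pre-training.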
Analyzing Pre-training Strategies for Multilingual Models
Pre-training Strategy for Zero-Shot Cross-Lingual Transfer