Evaluating Pre-training Strategies for a Bilingual Model
A company is developing a single sequence-to-sequence model to handle customer service requests in both English and Spanish. They are considering two pre-training strategies. Evaluate the two strategies below and determine which is more likely to result in a single, effective bilingual model. Justify your choice based on the principles of training such models.
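The core trade-off behind the question can be made concrete: pre-training one model on a mixed English–Spanish corpus with a single shared vocabulary lets both languages share one embedding space, which is what makes a single bilingual model work. A minimal sketch, using a hypothetical toy corpus and a whitespace tokenizer (real systems would use a learned subword vocabulary such as BPE or SentencePiece):

```python
from collections import Counter

# Toy mixed bilingual corpus (hypothetical example sentences).
corpus = [
    "where is my order",        # English
    "donde esta mi pedido",     # Spanish
    "cancel my subscription",
    "cancelar mi suscripcion",
]

# Build ONE shared vocabulary from the combined corpus, so both
# languages map into the same token-ID space and share a single
# embedding table.
counts = Counter(tok for sent in corpus for tok in sent.split())
shared_vocab = {tok: i for i, tok in enumerate(sorted(counts))}

def encode(sentence, vocab):
    """Map a sentence to token IDs under the given vocabulary."""
    return [vocab[tok] for tok in sentence.split()]

# English and Spanish requests land in the same ID space:
print(encode("where is my order", shared_vocab))
print(encode("donde esta mi pedido", shared_vocab))
```

Because every ID is unique across both languages, parameters learned from English data and Spanish data never conflict, and overlapping tokens are shared, which supports cross-lingual transfer during pre-training.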
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Evaluation in Bloom's Taxonomy
Related
A team is building a single encoder-decoder model intended to translate between Japanese, Korean, and Mandarin. They pre-train the model on a large, combined corpus of all three languages. However, instead of creating a unified vocabulary that includes tokens from all three languages, they use three separate, language-specific vocabularies. What is the most direct and critical consequence of this design choice on the model's translation performance?
Diagnosing Cross-Lingual Representation Issues
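The failure mode in the related trilingual question above can be shown in a few lines: with separate per-language vocabularies, the same integer ID denotes unrelated tokens depending on the language, so a single embedding table receives conflicting training signals and cannot build shared cross-lingual representations. A toy sketch with hypothetical word-level vocabularies (real systems would use learned subword vocabularies):

```python
# Three separate, language-specific vocabularies (toy examples).
ja_vocab = {"注文": 0, "配送": 1}   # Japanese: order, delivery
ko_vocab = {"주문": 0, "배송": 1}   # Korean: order, delivery
zh_vocab = {"订单": 0, "配送": 1}   # Mandarin: order, delivery

# Collision: ID 0 means three unrelated surface tokens, so one
# embedding table gets contradictory gradients for the same row.
assert ja_vocab["注文"] == ko_vocab["주문"] == zh_vocab["订单"] == 0

# The fix the question points at: a unified vocabulary where every
# token has a unique ID, and tokens shared across languages
# (e.g. 配送 in both Japanese and Mandarin) share one embedding.
all_tokens = set(ja_vocab) | set(ko_vocab) | set(zh_vocab)
unified = {tok: i for i, tok in enumerate(sorted(all_tokens))}
print(unified)
```

With the unified vocabulary, IDs are globally unique, so pre-training on the combined corpus can align the three languages in one representation space instead of overwriting it.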