A team is building a single encoder-decoder model intended to translate between Japanese, Korean, and Mandarin. They pre-train the model on a large, combined corpus of all three languages. However, instead of creating a unified vocabulary that includes tokens from all three languages, they use three separate, language-specific vocabularies. What is the most direct and critical consequence of this design choice on the model's translation performance?
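A minimal sketch of the core failure mode, using toy, hypothetical vocabularies and token IDs (real vocabularies would be far larger):

```python
# Three vocabularies built independently, one per language.
# Token IDs below are invented for illustration only.
vocab_ja = {"<pad>": 0, "学": 1, "校": 2}   # Japanese
vocab_ko = {"<pad>": 0, "학": 1, "교": 2}   # Korean
vocab_zh = {"<pad>": 0, "习": 1, "学": 2}   # Mandarin

# Problem 1: the same integer ID names a different token in each
# language, so one row of a shared embedding matrix would receive
# conflicting gradients for three unrelated symbols.
for name, vocab in [("ja", vocab_ja), ("ko", vocab_ko), ("zh", vocab_zh)]:
    id_to_token = {i: t for t, i in vocab.items()}
    print(f"{name}: ID 1 -> {id_to_token[1]}")   # 学 vs. 학 vs. 习

# Problem 2: the identical character "学" appears in both the Japanese
# and Mandarin corpora but receives unrelated IDs, so its two embeddings
# are never tied together and the shared-script overlap that would aid
# cross-lingual transfer is lost.
print("ID for 学:", "ja =", vocab_ja["学"], "| zh =", vocab_zh["学"])
```

A unified vocabulary built over the pooled corpus avoids both problems: every surface form maps to exactly one ID, so "学" shares a single embedding whether it appears in Japanese or Mandarin text, and no two languages compete for the same embedding row.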
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating Pre-training Strategies for a Bilingual Model
Diagnosing Cross-Lingual Representation Issues