Learn Before
Example of Translation Language Modeling
An example of translation language modeling (TLM) involves masking tokens in a concatenated bilingual input to force the model to learn cross-lingual alignments. Given a combined Chinese-English sequence such as [CLS] [MASK] 是 [MASK] 动物 。 [SEP] Whales [MASK] [MASK] . [SEP], the model must predict the original tokens at the masked positions. To accurately predict the masked Chinese token for 'whale' (鲸鱼), the model typically needs to rely on the corresponding unmasked English word Whales in the second half of the sequence. This explicit cross-lingual dependency shows how the training objective aligns representations between the two languages.
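The input construction described above can be sketched in a few lines of Python. This is a minimal illustration, not an actual pretraining pipeline: the function name `build_tlm_example`, the toy whitespace-level "tokens", and the masking rate are assumptions for demonstration. It concatenates a parallel sentence pair with [CLS]/[SEP] markers, randomly replaces content tokens with [MASK], and records the original token as the prediction label at each masked position.

```python
import random

MASK, CLS, SEP = "[MASK]", "[CLS]", "[SEP]"

def build_tlm_example(src_tokens, tgt_tokens, mask_prob=0.15, rng=None):
    """Concatenate a parallel sentence pair into one sequence and
    randomly mask tokens in both halves (translation language modeling).
    Returns (masked_sequence, labels); labels hold the original token at
    masked positions and None elsewhere (positions ignored by the loss)."""
    rng = rng or random.Random(0)  # fixed seed only for reproducible demos
    sequence = [CLS] + src_tokens + [SEP] + tgt_tokens + [SEP]
    masked, labels = [], []
    for tok in sequence:
        if tok not in (CLS, SEP) and rng.random() < mask_prob:
            masked.append(MASK)   # hide the token from the model
            labels.append(tok)    # ...but keep it as the training target
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels

# Toy parallel pair from the example: Chinese sentence and its English translation
zh = ["鲸鱼", "是", "哺乳", "动物", "。"]
en = ["Whales", "are", "mammals", "."]
masked_seq, targets = build_tlm_example(zh, en, mask_prob=0.3)
```

Because both halves sit in one sequence, a masked token in either language can be recovered by attending to its unmasked counterpart in the other language, which is exactly the cross-lingual signal the objective is designed to create.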
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences