Learn Before
  • Bilingual Sentence Packing for Pre-training

Example

Example of an Aligned Bilingual Sentence Pair

An example of an aligned sentence pair used in bilingual pre-training is the Chinese sentence '鲸鱼 是 哺乳 动物 。' and its English translation, 'Whales are mammals .' Both sentences are shown already tokenized, with tokens separated by spaces. Such pairs form the basic input for cross-lingual learning tasks.
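As a rough illustration, the pair above could be held in a small Python structure. This is only a sketch: the class name `AlignedPair`, its field names, and the whitespace-based token split are assumptions made for this example and are not taken from the course material.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AlignedPair:
    """Hypothetical container for one aligned sentence pair."""
    source: str  # sentence in the first language, tokens separated by spaces
    target: str  # its translation, tokens separated by spaces

    def source_tokens(self) -> List[str]:
        # Assumes the text is already tokenized and space-delimited.
        return self.source.split()

    def target_tokens(self) -> List[str]:
        return self.target.split()


# The example pair from the text above.
pair = AlignedPair(source="鲸鱼 是 哺乳 动物 。", target="Whales are mammals .")

print(pair.source_tokens())
print(pair.target_tokens())
```

Keeping both sentences in one record makes it harder for the two sides of a pair to drift apart when the data is shuffled or filtered.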


Updated 2026-04-18

Contributors are:

Gemini AI

Who are from:

Google


Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related
  • A pre-training strategy for a multilingual model involves taking an aligned sentence pair (e.g., an English sentence and its German translation) and concatenating them to form a single input sequence for one training step. What is the primary advantage of this method compared to training the model on the English and German sentences in separate, independent training steps?

  • Example of an Aligned Bilingual Sentence Pair

  • Constructing a Packed Bilingual Input

  • A researcher is pre-training a cross-lingual language model using a technique that combines sentences from two different languages into a single training input. Arrange the following steps to accurately describe this process.

Learn After
  • Example of a Packed Bilingual Sentence Sequence

  • A machine learning model is being trained to understand the relationship between sentences in two different languages. Which of the following pairs of sentences represents the highest-quality, most precisely aligned example for this training process?

  • Diagnosing Training Data Issues for a Bilingual Model

  • A key step in training a model to understand multiple languages is to provide it with correctly matched, or 'aligned,' sentence pairs. Match each English sentence with its direct Chinese translation to form a set of aligned pairs.
