Example of Sequence Packing for Translation

When preparing parallel text, such as translation pairs, for a sequence model, the source and target sentences can be packed into a single concatenated sequence using special tokens. For example, the Chinese sentence "鲸鱼 是 哺乳 动物 。" and its English translation "Whales are mammals ." can be packed as: [CLS] 鲸鱼 是 哺乳 动物 。 [SEP] Whales are mammals . [SEP]. In this structure, the [CLS] token marks the beginning of the sequence, the first [SEP] token separates the source sentence from the target sentence, and the final [SEP] token marks the end of the entire sequence.
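The packing described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `pack_pair` is hypothetical, and it assumes both sentences are already tokenized and separated by whitespace, as in the example.

```python
def pack_pair(source_tokens, target_tokens, cls="[CLS]", sep="[SEP]"):
    """Pack a source/target pair as: [CLS] source [SEP] target [SEP].

    Hypothetical helper for illustration; assumes pre-tokenized input.
    """
    return [cls, *source_tokens, sep, *target_tokens, sep]


# Whitespace splitting stands in for a real tokenizer (an assumption).
source = "鲸鱼 是 哺乳 动物 。".split()
target = "Whales are mammals .".split()

packed = pack_pair(source, target)
print(" ".join(packed))
# [CLS] 鲸鱼 是 哺乳 动物 。 [SEP] Whales are mammals . [SEP]
```

In practice a subword tokenizer would likely replace the whitespace split, but the position of the special tokens would stay the same.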

Updated 2026-04-18

Tags

Foundations of Large Language Models

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences