Learn Before
Correcting a Formatted Input Sequence
A data scientist is preparing the German sentence 'Der Hund bellt .' and its English translation 'The dog barks .' as input for a language model. They create the following combined input sequence: '[CLS] Der Hund bellt . [CLS] The dog barks . [SEP]'. Identify the two errors in this sequence and explain how to correct each one according to the standard formatting convention for this task.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Masking a Bilingual Sentence Pair
A researcher has an aligned sentence pair: the English sentence 'The sky is blue .' and its Spanish translation 'El cielo es azul .'. To prepare this data for a language model, these two sentences must be combined into a single input sequence using special markers. Which of the following options shows the correct format for this combined sequence?
Correcting a Formatted Input Sequence
You are given an aligned sentence pair: the German sentence 'Katzen sind Tiere .' and its English translation 'Cats are animals .'. Arrange the following components into the correct single input sequence format for a bilingual model.