Example of BERT-style Input for Masked Language Modeling
To illustrate a BERT-style input and target output format for Masked Language Modeling, consider the sentence:
"The puppies are frolicking outside the house."
By masking two tokens, for example "frolicking" and the second "the", and treating the final period as a separate token, the model's input becomes:
[CLS] The puppies are [MASK] outside [MASK] house .
The corresponding target output begins with the sequence start token <s> and contains the original words only at the masked positions, leaving blanks for all unmasked tokens:
<s> ___ ___ ___ frolicking ___ the ___ ___
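
The construction can be reproduced with a short Python sketch. It assumes simple whitespace tokenization and the illustrative special tokens shown above ([CLS], [MASK], <s>, and "___" for blanks); the function name build_mlm_example is hypothetical, and a real pipeline would use a subword tokenizer instead.

# Minimal sketch of building a BERT-style masked input and target.
# Assumes whitespace tokenization and illustrative special tokens;
# build_mlm_example is a hypothetical helper, not a library function.

def build_mlm_example(tokens, masked_positions):
    """Return (input_tokens, target_tokens) for masked language modeling.

    tokens: list of word tokens, e.g. ["The", "puppies", ...]
    masked_positions: set of 0-based indices of tokens to mask
    """
    # Input: prepend [CLS] and replace each masked token with [MASK].
    input_tokens = ["[CLS]"] + [
        "[MASK]" if i in masked_positions else tok
        for i, tok in enumerate(tokens)
    ]

    # Target: prepend <s>, keep the original word only at masked positions,
    # and leave a blank ("___") everywhere else.
    target_tokens = ["<s>"] + [
        tok if i in masked_positions else "___"
        for i, tok in enumerate(tokens)
    ]
    return input_tokens, target_tokens


if __name__ == "__main__":
    sentence = "The puppies are frolicking outside the house .".split()
    masked = {3, 5}  # "frolicking" and the second "the"
    inp, tgt = build_mlm_example(sentence, masked)
    print(" ".join(inp))  # [CLS] The puppies are [MASK] outside [MASK] house .
    print(" ".join(tgt))  # <s> ___ ___ ___ frolicking ___ the ___ ___

Running the sketch prints exactly the input and target sequences shown above, with the [MASK] positions in the input aligned one-for-one with "frolicking" and "the" in the target.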
