Example of Random Token Replacement in a BERT Input Sequence
To illustrate the random token replacement strategy in BERT's masked language modeling (MLM), consider the original two-sentence input: [CLS] It is raining . [SEP] I need an umbrella . [SEP]. During pre-training, 15% of input tokens are selected for prediction; of those, 80% are replaced with [MASK], 10% are replaced with a random token from the vocabulary, and 10% are left unchanged. If the token 'umbrella' falls under the 10% random replacement rule, it is substituted with a random vocabulary token, such as 'hat', yielding the corrupted sequence: [CLS] It is raining . [SEP] I need an hat . [SEP]. Note that the substitute need not be grammatical in context. The model's task is still to predict the original token 'umbrella' at that position.
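The sketch below makes the 80/10/10 corruption rule concrete. It is a minimal, self-contained illustration, assuming a toy whitespace tokenizer and a hypothetical miniature vocabulary (`VOCAB`); real BERT operates on WordPiece subwords with a vocabulary of roughly 30,000 tokens, and selects exactly 15% of tokens rather than sampling per token.

```python
import random

# Special tokens are never selected for corruption.
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[MASK]"}

# Hypothetical toy vocabulary used for random replacements (illustrative only).
VOCAB = ["it", "is", "raining", "i", "need", "an", "umbrella",
         "hat", "cat", "sat", "mat", "the", "."]

def corrupt_for_mlm(tokens, select_prob=0.15, seed=None):
    """Apply BERT-style MLM corruption: select ~15% of non-special tokens;
    of those, replace 80% with [MASK], 10% with a random vocabulary token,
    and leave 10% unchanged. Returns the corrupted sequence plus a map of
    position -> original token that the model must predict."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if tok in SPECIAL_TOKENS or rng.random() >= select_prob:
            continue  # token not selected for prediction
        targets[i] = tok  # prediction target is always the ORIGINAL token
        r = rng.random()
        if r < 0.8:
            corrupted[i] = "[MASK]"           # 80%: mask
        elif r < 0.9:
            corrupted[i] = rng.choice(VOCAB)  # 10%: random replacement
        # else: 10% of selected tokens are left unchanged
    return corrupted, targets

sequence = "[CLS] It is raining . [SEP] I need an umbrella . [SEP]".split()
corrupted, targets = corrupt_for_mlm(sequence, seed=0)
print(" ".join(corrupted))
print(targets)
```

Whichever of the three modifications is applied, the training loss is computed only at the selected positions, and the target there is always the original token; this is why the model must recover 'umbrella' even when the input shows 'hat'.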