Example of an Unchanged Token in a BERT Input Sequence
To illustrate the strategy of leaving a selected token unchanged in BERT's Masked Language Modeling (MLM), consider the original input: [CLS] It is raining . [SEP] I need an umbrella . [SEP]. In MLM, 15% of tokens are selected for prediction; of these, 80% are replaced with [MASK], 10% are replaced with a random token, and 10% are left unchanged. If the token 'I' is selected for prediction but falls into the 10% that are left as-is, the input sequence fed to the model is identical to the original. Even though the token is neither masked nor altered, the model must still predict 'I' at that position from the surrounding context.
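The sketch below is a minimal illustration of this 80/10/10 decision, not BERT's actual implementation. The function name, the toy vocabulary, and the chosen random seed are all illustrative assumptions; only the 80/10/10 split itself comes from the BERT pre-training recipe.

    import random

    # Illustrative toy vocabulary for the 10% random-replacement branch.
    VOCAB = ["it", "is", "raining", "i", "need", "an", "umbrella", "."]

    def corrupt_selected_token(token: str, rng: random.Random) -> str:
        """Return the input-side token for one position selected for prediction."""
        r = rng.random()
        if r < 0.8:                 # 80%: replace with [MASK]
            return "[MASK]"
        elif r < 0.9:               # 10%: replace with a random vocabulary token
            return rng.choice(VOCAB)
        else:                       # 10%: leave the original token unchanged
            return token

    rng = random.Random(7)
    tokens = ["[CLS]", "It", "is", "raining", ".", "[SEP]",
              "I", "need", "an", "umbrella", ".", "[SEP]"]
    selected = 6                    # position of 'I', chosen for prediction

    tokens[selected] = corrupt_selected_token(tokens[selected], rng)
    print(" ".join(tokens))
    # Whichever branch fires, the training label at position 6 is still 'I':
    # the model is always asked to predict the original token there.

Note that the label is fixed before corruption, so in the unchanged-token case the model sees exactly the original sequence yet is still scored on predicting 'I', which is the point of this part of the strategy.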