Example of Random Token Replacement in a BERT Input Sequence
To illustrate the random token replacement strategy in BERT's masked language modeling (MLM), consider the original two-sentence input: [CLS] It is raining . [SEP] I need an umbrella . [SEP]. During pre-training, 15% of input tokens are selected for prediction; of those, 80% are replaced with [MASK], 10% are replaced with a random token from the vocabulary, and 10% are left unchanged. If the token 'umbrella' falls under the 10% random replacement rule, it is substituted with a random vocabulary token, such as 'hat', yielding the corrupted sequence: [CLS] It is raining . [SEP] I need an hat . [SEP]. Note that the substitute need not be grammatical in context. The model's task is still to predict the original token 'umbrella' at that position.
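The sketch below makes the 80/10/10 corruption rule concrete. It is a minimal, self-contained illustration, assuming a toy whitespace tokenizer and a hypothetical miniature vocabulary (`VOCAB`); real BERT operates on WordPiece subwords with a vocabulary of roughly 30,000 tokens, and selects exactly 15% of tokens rather than sampling per token.

```python
import random

# Special tokens are never selected for corruption.
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[MASK]"}

# Hypothetical toy vocabulary used for random replacements (illustrative only).
VOCAB = ["it", "is", "raining", "i", "need", "an", "umbrella",
         "hat", "cat", "sat", "mat", "the", "."]

def corrupt_for_mlm(tokens, select_prob=0.15, seed=None):
    """Apply BERT-style MLM corruption: select ~15% of non-special tokens;
    of those, replace 80% with [MASK], 10% with a random vocabulary token,
    and leave 10% unchanged. Returns the corrupted sequence plus a map of
    position -> original token that the model must predict."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if tok in SPECIAL_TOKENS or rng.random() >= select_prob:
            continue  # token not selected for prediction
        targets[i] = tok  # prediction target is always the ORIGINAL token
        r = rng.random()
        if r < 0.8:
            corrupted[i] = "[MASK]"           # 80%: mask
        elif r < 0.9:
            corrupted[i] = rng.choice(VOCAB)  # 10%: random replacement
        # else: 10% of selected tokens are left unchanged
    return corrupted, targets

sequence = "[CLS] It is raining . [SEP] I need an umbrella . [SEP]".split()
corrupted, targets = corrupt_for_mlm(sequence, seed=0)
print(" ".join(corrupted))
print(targets)
```

Whichever of the three modifications is applied, the training loss is computed only at the selected positions, and the target there is always the original token; this is why the model must recover 'umbrella' even when the input shows 'hat'.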