Example of an Unchanged Token in a BERT Input Sequence

To illustrate the strategy of leaving a selected token unchanged in BERT's Masked Language Modeling, consider the original input: [CLS] It is raining . [SEP] I need an umbrella . [SEP]. Suppose the token 'I' is chosen for prediction but falls into the 10% of selected tokens that are left as is (rather than replaced with [MASK] or a random token). The input sequence fed to the model then remains identical to the original. Even though the token is neither masked nor altered, the model is still tasked with predicting 'I' from the surrounding context.
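
The sketch below illustrates how this case arises. It is a minimal toy implementation of the standard 80/10/10 corruption rule, not BERT's actual preprocessing code: the mask_tokens helper, the toy vocabulary, and the select_prob parameter are assumptions made for illustration.

```python
import random

def mask_tokens(tokens, vocab, select_prob=0.15, seed=None):
    """BERT-style MLM corruption (a toy sketch): of the tokens selected
    for prediction, 80% become [MASK], 10% become a random token, and
    10% are left unchanged -- the case described above."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = {}  # position -> original token the model must predict
    for i, tok in enumerate(tokens):
        if tok in ("[CLS]", "[SEP]"):
            continue  # special tokens are never selected
        if rng.random() < select_prob:
            targets[i] = tok
            r = rng.random()
            if r < 0.8:
                corrupted[i] = "[MASK]"           # 80%: replace with [MASK]
            elif r < 0.9:
                corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
            # else: 10% -- leave the token unchanged; the model must
            # still predict it from context, exactly as in the example
    return corrupted, targets

tokens = "[CLS] It is raining . [SEP] I need an umbrella . [SEP]".split()
vocab = ["it", "is", "raining", "i", "need", "an", "umbrella", "."]
corrupted, targets = mask_tokens(tokens, vocab, seed=0)
print(corrupted)  # input fed to the model (may be identical to the original)
print(targets)    # positions the model is asked to predict
```

Because the three outcomes are drawn at random per selected token, some runs leave 'I' untouched while still recording position 6 in targets: the loss is computed there even though the input is unchanged.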
