Learn Before
A language model is pre-trained using a method where 15% of the words in an input sentence are selected for prediction. Of these selected words, a small fraction (10%) are intentionally left in their original form, while the model is still tasked with predicting them based on the surrounding context. What is the most significant reason for this strategy of leaving some target words unchanged?
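For concreteness, below is a minimal Python sketch of the selection-and-corruption rule this question describes, following the standard BERT masked-language-modeling recipe (of the 15% of tokens selected for prediction, roughly 80% are masked, 10% are replaced with a random token, and 10% are left unchanged). The function name, toy vocabulary, and probability arguments are illustrative assumptions, not a specific library's API.

import random

MASK_TOKEN = "[MASK]"

def mlm_corrupt(tokens, vocab, select_prob=0.15, mask_prob=0.80, rand_prob=0.10):
    """Select ~15% of tokens as prediction targets; of those, mask 80%,
    replace 10% with a random vocabulary token, and leave 10% unchanged."""
    corrupted = list(tokens)
    targets = [None] * len(tokens)           # None = not a prediction target
    for i, tok in enumerate(tokens):
        if random.random() >= select_prob:
            continue                          # token not selected for prediction
        targets[i] = tok                      # model must recover the original word
        r = random.random()
        if r < mask_prob:
            corrupted[i] = MASK_TOKEN         # 80% of targets: replaced with [MASK]
        elif r < mask_prob + rand_prob:
            corrupted[i] = random.choice(vocab)  # 10%: random replacement
        # remaining 10%: input left exactly as it was, yet still predicted
    return corrupted, targets

random.seed(0)
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
print(mlm_corrupt(["the", "cat", "sat", "on", "the", "mat"], vocab))

Because some targets stay in their original form, the model cannot assume that every non-masked input token is correct, which is the design choice the question asks about.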
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of an Unchanged Token in a BERT Input Sequence
Calculating Token Modifications in Pre-training
Critique of a Modified Pre-training Strategy
Purpose of Unchanged Tokens in BERT's MLM Strategy