Concept

Token Selection and Modification Strategy in BERT's MLM

In the standard implementation of Masked Language Modeling (MLM) for BERT, 15% of the tokens in each input sequence are randomly selected for prediction. Each selected token is then modified in one of three ways: 80% of the time it is replaced with the special [MASK] token, 10% of the time it is replaced with a random token from the vocabulary, and 10% of the time it is left unchanged. The model is trained to predict the original token at every selected position, regardless of which modification was applied.
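The selection and modification procedure can be sketched as follows. This is a minimal illustration, not BERT's actual preprocessing code: the function name `mask_tokens`, the string `"[MASK]"` as the mask token, and the toy vocabulary are assumptions for the example; real implementations operate on subword token IDs from a tokenizer.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask-token string for this sketch

def mask_tokens(tokens, vocab, select_prob=0.15, seed=None):
    """BERT-style MLM masking sketch: select ~15% of positions for
    prediction, then apply the 80/10/10 modification strategy."""
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)  # original token at each selected position
    for i, tok in enumerate(tokens):
        if rng.random() < select_prob:  # position chosen for prediction
            labels[i] = tok
            r = rng.random()
            if r < 0.8:
                masked[i] = MASK_TOKEN         # 80%: replace with [MASK]
            elif r < 0.9:
                masked[i] = rng.choice(vocab)  # 10%: random vocabulary token
            # remaining 10%: keep the original token unchanged
    return masked, labels
```

A loss would then be computed only at positions where `labels` is not `None`, so unselected tokens contribute no training signal.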

Updated 2026-04-17

Tags

Ch.1 Pre-training - Foundations of Large Language Models