Consecutive Token Masking in MLM
In Masked Language Modeling (MLM), tokens are selected for masking at random, so nothing prevents the selection from landing on consecutive positions within a sequence [Joshi et al., 2020]. As a result, two or more adjacent words may be replaced by [MASK] symbols in a single training instance.
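The following minimal Python sketch illustrates this effect with BERT-style independent random masking. The 15% mask rate and the example sentence are illustrative assumptions, and whole-word tokenization is used for simplicity; this is not the exact procedure from any particular implementation.

```python
import random

def random_mask(tokens, mask_prob=0.15, seed=None):
    """Replace each token with [MASK] independently with probability mask_prob.

    Because positions are sampled independently, nothing prevents two or
    more adjacent tokens from being masked in the same training instance.
    """
    rng = random.Random(seed)
    return [tok if rng.random() >= mask_prob else "[MASK]" for tok in tokens]

tokens = "the quick brown fox jumps over the lazy dog".split()

# Resample until an instance contains at least one pair of adjacent masks,
# showing that consecutive masking arises naturally under random selection.
seed = 0
while True:
    masked = random_mask(tokens, mask_prob=0.15, seed=seed)
    if any(a == b == "[MASK]" for a, b in zip(masked, masked[1:])):
        print(" ".join(masked))
        break
    seed += 1
```

Assuming independent selection at a 15% mask rate, any particular pair of adjacent tokens is jointly masked with probability 0.15² = 2.25%, so over a large training corpus instances with contiguous [MASK] spans occur regularly.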