Concept

Consecutive Token Masking in MLM

In Masked Language Modeling (MLM), tokens are selected for masking independently at random, so two or more adjacent tokens can happen to be masked within the same sequence [Joshi et al., 2020]. In such a training instance, a run of consecutive words is replaced by [MASK] symbols, and the model must predict each of them without seeing its immediate neighbors.
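A minimal sketch of this effect, assuming BERT-style independent selection at a 15% rate (the 80/10/10 replacement rule is omitted for clarity; the function names are illustrative, not from any library):

```python
import random

MASK = "[MASK]"

def random_mask(tokens, mask_prob=0.15, seed=0):
    """Independently select each position for masking with probability
    mask_prob, as in simple BERT-style MLM token selection."""
    rng = random.Random(seed)
    return [MASK if rng.random() < mask_prob else t for t in tokens]

def has_consecutive_masks(tokens):
    """Return True if two adjacent positions are both masked."""
    return any(a == MASK and b == MASK for a, b in zip(tokens, tokens[1:]))

sentence = ["the", "quick", "brown", "fox", "jumps", "over",
            "the", "lazy", "dog", "near", "the", "old", "barn"]

# Over many random draws, some masking patterns will contain
# adjacent [MASK] tokens purely by chance.
hits = sum(has_consecutive_masks(random_mask(sentence, seed=s))
           for s in range(1000))
print(f"{hits} of 1000 random maskings contain consecutive [MASK] tokens")
```

With 15% independent masking over a dozen-token sequence, a noticeable fraction of draws contains at least one adjacent masked pair, which is the situation the passage describes.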

Updated 2026-04-16

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences