An input sequence of 200 tokens is processed during a model's self-supervised pre-training. The procedure first selects 15% of the tokens for modification. Of this selected group, 80% are replaced with a special mask symbol, 10% are replaced with a different, random token, and the final 10% are left as they are. Given this process, which statement accurately describes the state of the 200-token sequence after this modification step?
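The arithmetic behind the question (15% of 200 = 30 tokens selected; of those, 24 masked, 3 randomized, 3 left unchanged) can be checked with a short sketch of BERT-style masking. This is a minimal illustration, not the original implementation: the function name `mlm_corrupt` and the deterministic 80/10/10 partition of the selected positions are assumptions made for clarity.

```python
import random

MASK = "[MASK]"

def mlm_corrupt(tokens, vocab, select_rate=0.15, seed=0):
    """Sketch of BERT-style corruption with a deterministic 80/10/10 split.

    Of the selected positions: 80% -> mask symbol, 10% -> random vocab
    token, 10% left as-is (though still predicted during training).
    """
    rng = random.Random(seed)
    n_select = round(len(tokens) * select_rate)   # 15% of the sequence
    positions = rng.sample(range(len(tokens)), n_select)
    n_mask = round(n_select * 0.8)                # 80% of selected
    n_rand = round(n_select * 0.1)                # 10% of selected
    out = list(tokens)
    for i, pos in enumerate(positions):
        if i < n_mask:
            out[pos] = MASK                       # replace with mask symbol
        elif i < n_mask + n_rand:
            out[pos] = rng.choice(vocab)          # replace with random token
        # remaining 10% of selected positions: keep the original token
    return out, positions

# For a 200-token sequence: 30 selected -> 24 masked, 3 random, 3 kept,
# so 170 tokens are never selected and 173 end up identical to the input.
```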
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Computing Sciences
Application in Bloom's Taxonomy
Related
A language model is pre-trained with a masked language modeling objective. Arrange the following stages of its data processing and training pipeline in chronological order.
Analyzing a Pre-training Pipeline Implementation