Short Answer

Analyzing the Challenge of Consecutive Masking

A language model is pre-trained with a masked language modeling objective. During one training step, it sees the input "The cat sat on the [MASK] [MASK]." Explain why predicting the two masked tokens in this scenario is more challenging for the model than predicting two separate, non-adjacent masked tokens in a longer sentence (e.g., "The [MASK] cat sat on the warm [MASK].").
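
One way to make the difficulty concrete (a minimal sketch, not part of the original question): a BERT-style masked language model scores every masked position independently given the visible tokens, in a single forward pass, effectively approximating the joint p(token_i, token_{i+1} | visible context) by the product p(token_i | visible context) * p(token_{i+1} | visible context). When the two masks are adjacent, each prediction loses its most informative neighbor, and because the two marginal distributions are never reconciled, the independently chosen fillers can combine incoherently (for instance, mixing a word from one plausible phrase with a word from a different one). The Python sketch below, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint, makes the independence visible: both masks are filled from the same forward pass, and neither prediction conditions on the other.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Assumed checkpoint; any BERT-style masked LM behaves the same way here.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

text = "The cat sat on the [MASK] [MASK]."
inputs = tokenizer(text, return_tensors="pt")

# Indices of the two [MASK] tokens in the tokenized input.
mask_positions = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0]  # shape: (sequence_length, vocab_size)

# One distribution per masked position, all from the same forward pass:
# the prediction for the first mask never sees what the second becomes.
for pos in mask_positions:
    top_ids = logits[pos].topk(5).indices.tolist()
    print(pos.item(), tokenizer.convert_ids_to_tokens(top_ids))

With non-adjacent masks (as in the longer example sentence), each masked position keeps its immediate neighbors as evidence, so each marginal distribution is sharper and the independence assumption costs far less.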


Tags

Ch.1 Pre-training - Foundations of Large Language Models