Concept

Limitation of MLM: Ignoring Dependencies Between Masked Tokens

A key limitation of the auto-encoding objective in Masked Language Modeling (MLM) is that it does not account for dependencies among the masked tokens. The model is trained to predict each masked token independently of the others, given only the corrupted input. For example, if two tokens $x_2$ and $x_6$ in a sequence are masked, the prediction for $x_2$ is made independently of $x_6$, and vice versa, even though $x_6$ should ideally be predicted in the context of $x_2$.
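
The independence assumption can be made explicit by comparing the exact joint log-probability of the two masked tokens with the quantity the MLM objective actually optimizes. This is a sketch rather than a formal statement: the symbol $\bar{\mathbf{x}}$ for the masked input sequence is notation assumed here, not taken from the text above.

$$
\begin{aligned}
\text{exact (chain rule):}\quad & \log \Pr(x_2, x_6 \mid \bar{\mathbf{x}}) = \log \Pr(x_2 \mid \bar{\mathbf{x}}) + \log \Pr(x_6 \mid x_2, \bar{\mathbf{x}}) \\
\text{MLM objective:}\quad & \log \Pr(x_2, x_6 \mid \bar{\mathbf{x}}) \approx \log \Pr(x_2 \mid \bar{\mathbf{x}}) + \log \Pr(x_6 \mid \bar{\mathbf{x}})
\end{aligned}
$$

The two expressions agree only when $x_2$ and $x_6$ are conditionally independent given $\bar{\mathbf{x}}$. In the MLM objective, the second term drops the conditioning on $x_2$, which is precisely the dependency that is ignored.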

Updated 2026-04-15

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences