Concept

Training Objective of Masked Language Modeling (MLM)

Given an original text sequence $\mathbf{x}$ and its corrupted version $\bar{\mathbf{x}}$, training a model to predict $\mathbf{x}$ from $\bar{\mathbf{x}}$ can be viewed as an autoencoding-like process. The basic training objective is to maximize the reconstruction probability $\Pr(\mathbf{x} \mid \bar{\mathbf{x}})$. However, since the two sequences are aligned position by position, an unmasked token in $\bar{\mathbf{x}}$ is identical to the token in $\mathbf{x}$ at the same position, so predicting it is trivial. The training objective therefore simplifies to maximizing the probabilities of the masked tokens only, i.e., $\sum_{i \in \mathcal{M}} \log \Pr(x_i \mid \bar{\mathbf{x}})$, where $\mathcal{M}$ is the set of masked positions.
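To make the simplification concrete, below is a minimal PyTorch-style sketch (not from the text): the helper names `mlm_labels` and `mlm_loss` and the `-100` ignore-index convention are illustrative assumptions. The point is that the cross-entropy sum runs only over masked positions, matching the simplified objective above.

```python
import torch
import torch.nn.functional as F

def mlm_labels(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Keep original token ids at masked positions; ignore the rest.

    x:    (batch, seq_len) original token ids from x.
    mask: (batch, seq_len) boolean, True where a token was corrupted.
    """
    labels = x.clone()
    labels[~mask] = -100  # unmasked tokens carry no training signal
    return labels

def mlm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood over masked positions only.

    logits: (batch, seq_len, vocab_size) model predictions given x_bar.
    """
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,  # drops every unmasked position from the sum
    )
```

Minimizing this loss is equivalent to maximizing $\sum_{i \in \mathcal{M}} \log \Pr(x_i \mid \bar{\mathbf{x}})$: the unmasked positions are excluded from the sum rather than predicted.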
