Learn Before
  • General Formulation of a Sequence Model

Masked Language Modeling

Masked Language Modeling (MLM) is a self-supervised learning objective in which a model is trained to predict tokens that have been randomly masked in an input sequence. This approach allows the model to learn deep bidirectional representations by using both left and right context. For instance, if tokens $x_1$ and $x_3$ are masked in a sequence, the model's task is to predict these original tokens from the corrupted input, which can be represented as $(x_0, \text{[MASK]}, x_2, \text{[MASK]}, x_4) \rightarrow (x_1, x_3)$. The model predicts each masked token from the full context of the unmasked tokens, computing conditional probabilities such as $\Pr(x_1 \mid \mathbf{e}_0, \mathbf{e}_{\text{mask}}, \mathbf{e}_2, \mathbf{e}_{\text{mask}}, \mathbf{e}_4)$ and $\Pr(x_3 \mid \mathbf{e}_0, \mathbf{e}_{\text{mask}}, \mathbf{e}_2, \mathbf{e}_{\text{mask}}, \mathbf{e}_4)$, where $\mathbf{e}_i$ denotes the embedding of the token at position $i$ and $\mathbf{e}_{\text{mask}}$ the embedding of the [MASK] token.

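A minimal PyTorch sketch of this corrupt-then-predict setup is given below. The masking rate, the [MASK] token id, and the toy token ids are illustrative assumptions rather than any specific model's recipe, and the loss computation is only indicated in comments because it depends on a concrete model.

```python
import torch

# Assumed constants for illustration: the id of the [MASK] token and
# the fraction of positions to corrupt.
MASK_ID = 103
MASK_RATE = 0.15

def mask_tokens(input_ids: torch.Tensor, mask_rate: float = MASK_RATE):
    """Randomly replace tokens with [MASK]; return (corrupted input, labels)."""
    labels = input_ids.clone()
    # Choose positions to mask uniformly at random.
    mask = torch.rand(input_ids.shape) < mask_rate
    corrupted = input_ids.clone()
    corrupted[mask] = MASK_ID
    # The loss is computed only at masked positions; -100 marks ignored ones.
    labels[~mask] = -100
    return corrupted, labels

# The example from the text: x1 and x3 are replaced by [MASK].
x = torch.tensor([[7, 42, 11, 59, 8]])     # (x0, x1, x2, x3, x4), toy ids
corrupted = x.clone()
corrupted[0, [1, 3]] = MASK_ID             # (x0, [MASK], x2, [MASK], x4)
labels = torch.full_like(x, -100)
labels[0, [1, 3]] = x[0, [1, 3]]           # targets: the original x1 and x3

# Training then minimizes cross-entropy between the model's predictions at
# the masked positions and the original tokens, e.g.:
#   logits = model(corrupted)              # shape (batch, seq_len, vocab)
#   loss = torch.nn.functional.cross_entropy(
#       logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100)
```

Because only the masked positions contribute to the loss, the model can attend to the entire corrupted sequence when predicting each target, which is what makes the learned representations bidirectional.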
Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related
  • Output Variation in Sequence Models

  • Fundamental Issues in Sequence Model Formulation

  • Role of the [CLS] Token in Sequence Classification

  • Standard Auto-Regressive Probability Factorization

  • Masked Language Modeling

  • Comparison of Causal and Masked Language Modeling

  • Input Formatting with Separator Tokens