Comparison of Masked vs. Causal Language Modeling

Causal Language Modeling (CLM), also known as autoregressive language modeling, can be understood as a special case of Masked Language Modeling (MLM). In CLM, the prediction of a token at a given position is constrained by masking all subsequent tokens in the right-hand context. This forces the model to rely exclusively on the preceding left-hand context, making it a unidirectional process. In contrast, the general MLM approach is bidirectional: it uses all unmasked tokens, from both the left and right contexts, to predict a masked token within a sequence.
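The difference between the two objectives can be made concrete by writing out the visibility masks. Below is a minimal NumPy sketch (the sequence length and the specific masked positions are illustrative assumptions, not from the source): the causal mask is lower triangular, so position i sees only positions ≤ i, while the MLM mask lets every position attend to all tokens that were not selected for masking.

```python
import numpy as np

seq_len = 5

# Causal LM: token i may attend only to positions <= i (left context).
# A lower-triangular matrix encodes this; 1 = visible, 0 = masked out.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

# Masked LM: choose positions to predict (1 and 3 here, purely for
# illustration); every token may attend to all positions that were
# NOT selected for masking, from both left and right contexts.
masked_positions = [1, 3]
visible = np.ones(seq_len, dtype=int)
visible[masked_positions] = 0
mlm_mask = np.tile(visible, (seq_len, 1))

print(causal_mask)
print(mlm_mask)
```

Reading row i of each matrix gives the context available when predicting position i: the causal rows grow one token at a time (unidirectional), whereas every MLM row exposes the same bidirectional set of unmasked tokens.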

Updated 2026-04-15


Ch.1 Pre-training - Foundations of Large Language Models
