Learn Before
General Formulation of a Sequence Model
Comparison of Causal and Masked Language Modeling
Causal Language Modeling (CLM) and Masked Language Modeling (MLM) are the two primary pre-training objectives for language models. The key difference lies in the context available for prediction. CLM is unidirectional (auto-regressive): it predicts each token $x_i$ using only the preceding tokens $x_0, \ldots, x_{i-1}$, i.e., it models $\Pr(x_i \mid x_0, \ldots, x_{i-1})$. This makes it well suited to generative tasks. In contrast, MLM is bidirectional: it predicts a masked token $x_i$ using both its left context $x_0, \ldots, x_{i-1}$ and its right context $x_{i+1}, \ldots, x_m$. Seeing context on both sides lets the model build a deeper understanding of language, making it well suited to tasks like question answering and sentiment analysis.
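To make the contrast concrete, below is a minimal Python sketch of how training pairs are typically built under each objective. It is illustrative rather than from the source card: the toy token ids, the `MASK_ID` value, the 15% masking rate (a BERT-style convention), and the `-100` ignore label (PyTorch's default `ignore_index` for cross-entropy) are all assumptions.

```python
import random

MASK_ID = 103      # assumed [MASK] token id (BERT-style convention)
MASK_PROB = 0.15   # assumed masking rate, following BERT

def clm_example(token_ids):
    """CLM: the target at each position is the NEXT token, so
    position i is predicted from tokens 0..i-1 only."""
    inputs = token_ids[:-1]   # x_0 ... x_{m-1}
    targets = token_ids[1:]   # x_1 ... x_m (shifted left by one)
    return inputs, targets

def mlm_example(token_ids, rng=random):
    """MLM: randomly mask positions; the target is the ORIGINAL token
    at each masked position, predicted from both left and right context."""
    inputs, targets = [], []
    for tok in token_ids:
        if rng.random() < MASK_PROB:
            inputs.append(MASK_ID)   # the model sees [MASK] here
            targets.append(tok)      # loss is computed on the original token
        else:
            inputs.append(tok)
            targets.append(-100)     # ignore label: no loss at unmasked positions
    return inputs, targets

tokens = [5, 17, 42, 8, 99]  # toy token-id sequence
print(clm_example(tokens))   # ([5, 17, 42, 8], [17, 42, 8, 99])
print(mlm_example(tokens, random.Random(1)))
# e.g. ([103, 17, 42, 8, 99], [5, -100, -100, -100, -100])
```

Note that the CLM construction is the standard auto-regressive factorization $\Pr(x_0, \ldots, x_m) = \Pr(x_0) \prod_{i=1}^{m} \Pr(x_i \mid x_0, \ldots, x_{i-1})$ written as data: shifting the sequence by one position ensures each target depends only on earlier inputs, whereas each MLM target can be predicted from both directions.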

Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Output Variation in Sequence Models
Fundamental Issues in Sequence Model Formulation
Role of the [CLS] Token in Sequence Classification
Standard Auto-Regressive Probability Factorization
Masked Language Modeling
Input Formatting with Separator Tokens