Concept

Pre-training tasks

Pre-training tasks are fundamental for developing universal language representations. After a model undergoes pre-training, various techniques are employed to adapt it for specific downstream applications. The pre-training tasks themselves are generally classified into three main types:

  • Supervised learning: Training on data with explicit input-output pairs.
  • Unsupervised learning: Discovering inherent knowledge from unlabeled data, for example through probabilistic (next-token) language modeling.
  • Self-supervised learning: A hybrid approach in which the supervision signals are generated automatically from the data itself, as in masked language modeling (MLM); see the sketch after this list.
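
To make the last two categories concrete, the sketch below shows how training targets can be derived from raw, unlabeled text: probabilistic (causal) language modeling predicts each next token from its prefix, while MLM hides tokens and asks the model to recover them. This is a minimal illustration, not any specific model's recipe; the `[MASK]` string, the ~15% mask rate (a common BERT-style choice), and the toy sentence are assumptions for demonstration.

```python
import random

MASK_TOKEN = "[MASK]"   # placeholder symbol; real models use a reserved vocab id
MASK_RATE = 0.15        # BERT-style models commonly mask ~15% of tokens

def next_token_pairs(tokens):
    """Probabilistic (causal) LM: each prefix predicts the next token."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def mlm_example(tokens, rng=random):
    """MLM: randomly hidden tokens become the prediction targets."""
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < MASK_RATE:
            inputs.append(MASK_TOKEN)  # hide the token in the input...
            labels.append(tok)         # ...and recover it as the label
        else:
            inputs.append(tok)
            labels.append(None)        # unmasked positions are not scored
    return inputs, labels

if __name__ == "__main__":
    random.seed(1)  # seeded only so the demo output is reproducible
    tokens = "pre training builds universal language representations".split()
    print(next_token_pairs(tokens)[:2])  # prefix -> next-token pairs
    print(mlm_example(tokens))           # masked inputs and their targets
```

In both cases the "labels" come for free from the text itself, which is why these objectives scale to web-sized corpora where explicit human annotation would be impossible.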


Updated 2026-04-17

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Related

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models