Comparison of Pre-training Paradigms

Pre-training paradigms differ primarily in their data requirements, underlying assumptions, and adaptation methods. Unsupervised pre-training uses large-scale unlabeled data to provide a good initialization, but the resulting model still requires considerable further training on labeled data. Supervised pre-training relies on labeled datasets from the outset and assumes that different tasks are related, so a model trained on one task can be transferred to another via fine-tuning. Self-supervised pre-training, in contrast, leverages large-scale unlabeled data to train a model that can then be adapted to new tasks efficiently, through methods such as fine-tuning or prompting.
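The two adaptation routes named at the end of the paragraph, fine-tuning versus prompting, can be made concrete with a short sketch. The snippet below is illustrative only: it assumes the Hugging Face transformers library with PyTorch, and the model names (distilbert-base-uncased, gpt2), the two-example "dataset", and the prompt format are all placeholder choices, not anything prescribed by the text.

```python
# A minimal sketch of the two ways to adapt a self-supervised pre-trained
# model. Library (transformers + PyTorch) and model names are assumptions
# made for illustration, not part of the original comparison.

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

# --- Route 1: fine-tuning --------------------------------------------------
# A pre-trained encoder receives a new classification head, and its weights
# are updated by gradient descent on a (typically small) labeled dataset.
tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
clf = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

batch = tok(["great movie", "terrible movie"], return_tensors="pt", padding=True)
labels = torch.tensor([1, 0])  # toy labels, placeholder data

optimizer = torch.optim.AdamW(clf.parameters(), lr=2e-5)
loss = clf(**batch, labels=labels).loss
loss.backward()
optimizer.step()  # one gradient step: the pre-trained weights change

# --- Route 2: prompting ----------------------------------------------------
# The same pre-trained weights are used as-is; the task is specified entirely
# through the input text, with no parameter updates at all.
gen_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Review: great movie. Sentiment (positive/negative):"
ids = gen_tok(prompt, return_tensors="pt")
out = lm.generate(**ids, max_new_tokens=3, pad_token_id=gen_tok.eos_token_id)
print(gen_tok.decode(out[0][ids["input_ids"].shape[1]:]))
```

The contrast the paragraph draws is visible in the code: the first route changes the pre-trained parameters with an optimizer step, while the second leaves them untouched and encodes the task purely in the prompt.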
