Concept

Historical Context of Pre-training

While pre-training is a cornerstone of modern NLP, its conceptual origins trace back to the early history of deep learning. Early strategies used unsupervised learning to initialize deep architectures such as RNNs, feedforward networks, and autoencoders. The paradigm saw a modern resurgence with the success of large-scale unsupervised training of word embedding models such as word2vec. Concurrently, a distinct approach became standard in computer vision, where models were pre-trained with supervision on large labeled datasets like ImageNet and then adapted to downstream tasks. This diverse history across domains and techniques set the stage for the large-scale, self-supervised pre-training that now defines the field of NLP.
