Learn Before
Word Prediction as a Core Self-Supervised Task
A foundational self-supervised learning strategy behind prominent NLP models such as BERT and GPT is word prediction. The model is trained on a vast unlabeled text corpus to predict words hidden from it: BERT-style encoders predict masked words from their surrounding context, while GPT-style decoders predict each next word from the words that precede it. Because the missing word itself serves as the training label, this simple objective lets the model develop a general capacity for language understanding and generation without any explicit, human-provided labels.
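The defining property of both objectives is that the supervision signal is generated from the raw text itself. The following minimal Python sketch (the function names and toy sentence are illustrative, not from any specific library) shows how both kinds of training pairs can be produced from an unlabeled sentence with no annotation:

```python
import random

def masked_lm_example(tokens, mask_token="[MASK]"):
    """BERT-style objective: hide one word; the hidden word becomes the label."""
    i = random.randrange(len(tokens))
    masked = tokens[:i] + [mask_token] + tokens[i + 1:]
    return masked, tokens[i]

def next_word_examples(tokens):
    """GPT-style objective: every prefix predicts the word that follows it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

sentence = "the model learns language from raw text".split()
print(masked_lm_example(sentence))       # e.g. (['the', '[MASK]', ...], 'model')
print(next_word_examples(sentence)[:2])  # [(['the'], 'model'), (['the', 'model'], 'learns')]
```

In a real system these (input, label) pairs would feed a neural network trained with cross-entropy loss, but the sketch captures the core idea: every sentence in the corpus yields training examples for free.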
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Comparison of Self-Supervised Pre-training and Self-Training
Architectural Categories of Pre-trained Transformers
Self-Supervised Classification Tasks for Encoder Training
Prefix Language Modeling (PrefixLM)
Mask-Predict Framework
Discriminative Training
Learning World Knowledge from Unlabeled Data
Emergent Linguistic Capabilities from Pre-training
Architectural Approaches to Self-Supervised Pre-training
Self-Supervised Pre-training of Encoders via Masked Language Modeling
Learning World Knowledge from Unlabeled Data via Self-Supervision
A research team has a massive collection of unlabeled historical texts. Their goal is to pre-train a language model that understands the specific vocabulary and sentence structures within these documents, but they have no budget for manual data annotation. Which of the following approaches is the most effective and feasible for their pre-training task?
Analysis of Supervision Signal Generation
A team is developing a pre-training strategy for a new language model using a large corpus of unlabeled text. Which of the following proposed tasks best exemplifies the principles of self-supervised learning?
Prevalence of Self-Supervised Pre-training in NLP
Learn After
An AI team trains a language model on a massive dataset of books and articles. The training process consists of a single, repeated task: the model is presented with a sentence where one word has been removed, and its goal is to predict the original missing word. The model is not given any other information or explicit rules about grammar or meaning. Based on this training method alone, what fundamental understanding is the model most likely developing?
Comparing Word Prediction Strategies
From Prediction to Understanding