Concept

BERT (Bidirectional Encoder Representations from Transformers)

BERT (Bidirectional Encoder Representations from Transformers) is one of the most widely used pre-trained sequence encoding models in Natural Language Processing (NLP). As a foundation model, it is trained with a self-supervised approach that combines two tasks: masked language modeling (MLM) and next sentence prediction (NSP). In MLM, the model predicts randomly masked words from their surrounding context, which lets it learn deep bidirectional language representations. In NSP, the model predicts whether two sentences appear consecutively in the original text, which teaches it sentence-level relationships. This dual-task training makes BERT a versatile foundation model that can be adapted to a wide array of NLP applications.
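
As a concrete illustration of the MLM objective, below is a minimal sketch that assumes the Hugging Face `transformers` library and the public `bert-base-uncased` checkpoint (both are assumptions for illustration, not part of this card). It masks one token and lets BERT fill it in using context from both directions:

```python
# Minimal MLM sketch, assuming the Hugging Face `transformers` library
# and the `bert-base-uncased` checkpoint are available.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Mask a single token; BERT predicts it from bidirectional context.
text = "BERT learns deep [MASK] representations of language."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring vocabulary entry.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))  # a plausible fill, e.g. "bidirectional"
```

During pre-training the same idea is applied at scale: a fraction of tokens is masked at random and the model is optimized to recover them, which is what forces the encoder to use context on both sides of each position.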

Updated 2026-05-02

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Ch.2 Generative Models - Foundations of Large Language Models
