Training Objective of the Standard BERT Model
As proposed in the original paper by Devlin et al. (2019), the standard BERT model is a Transformer encoder pre-trained with a dual-task objective: Masked Language Modeling (MLM) and Next Sentence Prediction (NSP). The two tasks are learned simultaneously, and the total training loss is the sum of their individual losses.
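A minimal sketch of this combined objective, assuming the standard formulation (the notation below is mine, not taken from the note): the MLM term is the cross-entropy over the masked-token predictions, and the NSP term is the binary cross-entropy for the is-next / not-next classification, so the total loss is

$$\mathcal{L}_{\mathrm{BERT}} = \mathcal{L}_{\mathrm{MLM}} + \mathcal{L}_{\mathrm{NSP}}.$$

Because both terms are computed from the same forward pass over a masked sentence-pair input, each gradient update reflects the model's performance on both tasks at once.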

References
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of NAACL-HLT 2019.
Foundations of Large Language Models Course
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Related
BERT Experiments
BERT&GPT and Fine Tuning
BERT Input Representation: Single and Paired Sentences
BERT's Contributions and Impact
Training Objective of the Standard BERT Model
Example of Next Sentence Prediction (NSP) Input Formatting
Training Data Generation for Next Sentence Prediction
Next Sentence Prediction as an Auxiliary Training Objective
Limitation of Next Sentence Prediction: Reliance on Superficial Cues
Example of an Unrelated Sentence Pair for NSP
Training Objective of the Standard BERT Model
Pre-training Strategy for a Question-Answering Model
Potential for Learning Superficial Cues in Simple Prediction Tasks
A language model is pre-trained on a large corpus of text using a specific objective: for any given pair of sentences, the model must predict whether the second sentence is the one that actually follows the first in the source document. Which of the following best describes the primary type of understanding this training method is intended to instill in the model?
A language model is pre-trained exclusively on a task where it learns to predict if one sentence immediately follows another in a large text corpus. While the model achieves high accuracy on this pre-training task, it struggles when fine-tuned for tasks requiring nuanced logical inference between sentences. Which of the following statements provides the most insightful critique of the pre-training task, explaining this performance gap?
Your team is building an internal model that must ...
Your team is pre-training a text model for an inte...
Your team is pre-training an internal LLM for a co...
Your team is pre-training an internal LLM to suppo...
Selecting a Pre-training Objective Mix for a Corporate LLM
Diagnosing Pre-training Objective Mismatch from Product Failures
Choosing a Pre-training Objective Under Data Constraints and Deployment Needs
Pre-training Objective Choice for a Multi-Modal Enterprise Writing Assistant
Root-Cause Analysis of Pre-training Objective Leakage and Coherence Failures
Selecting a Pre-training Objective for a Regulated Enterprise Assistant
Binary Classification System for Next Sentence Prediction
Classification on Sequence Representation
[SEP] Token in Sequence Classification
Comparison of Masked vs. Causal Language Modeling
Formal Definition of the Masking Process in MLM
Example of Masked Language Modeling with Single and Multiple Masks
Training Objective of Masked Language Modeling (MLM)
Drawback of Masked Language Modeling: The [MASK] Token Discrepancy
Limitation of MLM: Ignoring Dependencies Between Masked Tokens
The Generator in Replaced Token Detection
Consecutive Token Masking in MLM
Token Selection and Modification Strategy in BERT's MLM
BERT's Masked Language Modeling Pre-training Pipeline
Performance Degradation and Early Stopping in Pre-training
Flexibility of Masked Language Modeling for Encoder-Decoder Training
Training Objective of the Standard BERT Model
During a self-supervised pre-training process, a model is given an input sequence where one word has been replaced by a special symbol, for example: 'The quick brown [MASK] jumps over the lazy dog.' The model's objective is to predict the original word, 'fox'. Which of the following is the direct input used by the final output layer to make this specific prediction?
Original Sequence for Masking and Deletion Examples
Arrange the following steps in the correct order to describe the process of pre-training an encoder model using a masked language modeling objective.
Evaluating a Pre-training Strategy for a Specific Application
Training Objective of the Standard BERT Model
A deep sequence model is constructed by stacking multiple layers. Each layer consists of two sub-layers (e.g., a self-attention mechanism and a feed-forward network). A 'post-norm' architecture is used for each sub-layer, which involves applying the sub-layer's main function, adding a residual connection from the input, and then performing layer normalization. If x represents the input to a sub-layer and F(x) represents the output of that sub-layer's main function, which of the following expressions correctly computes the final output of that sub-layer?
A deep sequence model is built by stacking multiple layers. Each layer contains sub-layers (like self-attention or a feed-forward network) that use a 'post-norm' architecture. Arrange the following operations in the correct order as they would occur to transform an input vector within a single sub-layer.
Architectural Component Analysis
Input Embedding Formula in BERT-like Models
Learn After
BERT Loss Function
Concurrent Loss Calculation for MLM and NSP
A researcher is pre-training a large language model using a dual-task objective. The model is simultaneously trained on two tasks:
- Predicting randomly obscured words within a given text.
- Determining if two text segments presented together originally appeared consecutively.
The final training update is based on the model's combined performance on both tasks. Which of the following statements best analyzes the primary advantage of this specific dual-task approach?
Evaluating a Modified Pre-training Strategy
The original pre-training process for the Bidirectional Encoder Representations from Transformers model involves a dual-task objective where the total loss is the sum of the losses from two distinct tasks. Match each training task to its corresponding description.