BERT Ablation Studies Examples
Some examples of ablation studies:
- No NSP: A bidirectional model that is still trained with the “Masked LM” (MLM) objective but without the “next sentence prediction” (NSP) task.
- LTR and No NSP: Same as above, but unidirectional. The model reads left to right (LTR) and is trained as a standard left-to-right language model, so it doesn’t use masking.
- Model size: Versions of BERT are built with varying numbers of layers, hidden units, etc., while the same hyperparameters and training procedure are used for each, so that any change in results can be attributed to model size.
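The ablations listed above can be thought of as a baseline configuration with one factor switched off or varied at a time. The following is a minimal sketch of that idea in plain Python, not the original training code: `AblationConfig`, `attention_mask`, `training_objectives`, and the specific size pairs are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AblationConfig:
    """Hypothetical description of one BERT variant in an ablation study."""
    name: str
    bidirectional: bool   # False means left-to-right (LTR) context only
    use_masked_lm: bool   # False means a standard left-to-right LM objective
    use_nsp: bool         # next sentence prediction task on/off
    num_layers: int = 12  # assumed baseline size for this sketch
    hidden_size: int = 768


def attention_mask(seq_len, bidirectional):
    """mask[i][j] == 1 if position i may attend to position j."""
    return [
        [1 if bidirectional or j <= i else 0 for j in range(seq_len)]
        for i in range(seq_len)
    ]


def training_objectives(cfg):
    """List the pre-training objectives implied by a configuration."""
    objectives = ["masked_lm" if cfg.use_masked_lm else "left_to_right_lm"]
    if cfg.use_nsp:
        objectives.append("next_sentence_prediction")
    return objectives


BASELINE = AblationConfig("Baseline", bidirectional=True,
                          use_masked_lm=True, use_nsp=True)
# Each ablation changes only the factor under study.
NO_NSP = replace(BASELINE, name="No NSP", use_nsp=False)
LTR_NO_NSP = replace(NO_NSP, name="LTR and No NSP",
                     bidirectional=False, use_masked_lm=False)

# Model-size ablation: vary layers and hidden units only (illustrative values),
# keeping objectives and every other setting identical to the baseline.
SIZE_VARIANTS = [
    replace(BASELINE, name=f"L={layers}, H={hidden}",
            num_layers=layers, hidden_size=hidden)
    for layers, hidden in [(3, 768), (6, 768), (12, 768), (24, 1024)]
]

if __name__ == "__main__":
    for cfg in [BASELINE, NO_NSP, LTR_NO_NSP, *SIZE_VARIANTS]:
        print(cfg.name, training_objectives(cfg), cfg.num_layers, cfg.hidden_size)
```

Deriving each variant from the baseline with `replace` keeps every unchanged setting identical, which is the point of an ablation: differences in downstream results can then be traced to the single factor that was altered.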
Updated 2021-08-12
Tags
Data Science