Learn Before
Purpose of Unchanged Tokens in BERT's MLM Strategy
In BERT's masked language modeling (MLM) strategy, some of the tokens selected for prediction are intentionally left unchanged in the input sequence. Predicting such a token is a relatively simple task, because the original token is explicitly available at its own position in the context. The purpose of including these cases is to guide the model to use this easier, more direct evidence for its predictions.
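The selection and corruption step described above can be sketched as follows. This is a minimal illustration, not BERT's actual implementation: the function name `corrupt_for_mlm`, its parameters, and the toy vocabulary are hypothetical. The 15% selection rate and the 10% unchanged share are the commonly cited figures; the split of the remaining selected tokens (10% replaced by a random token, the rest replaced by `[MASK]`) is assumed from the commonly described BERT recipe and is not stated in this card.

```python
import random

MASK = "[MASK]"

def corrupt_for_mlm(tokens, vocab, select_prob=0.15, keep_prob=0.10,
                    random_prob=0.10, seed=None):
    """Return (corrupted_tokens, targets) for one sequence.

    targets maps each selected position to its original token. The model is
    asked to predict every target, including the ones left unchanged.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() >= select_prob:
            continue  # position not selected for prediction
        targets[i] = tok
        r = rng.random()
        if r < keep_prob:
            # Unchanged case: the original token stays visible in the input,
            # so the model can rely on direct evidence to predict it.
            pass
        elif r < keep_prob + random_prob:
            corrupted[i] = rng.choice(vocab)  # random replacement
        else:
            corrupted[i] = MASK  # standard masking
    return corrupted, targets

if __name__ == "__main__":
    sentence = "the early bird catches the worm".split()
    toy_vocab = ["cat", "dog", "tree", "runs", "blue"]
    print(corrupt_for_mlm(sentence, toy_vocab, seed=0))
```

Setting `keep_prob=0.0` in this sketch removes the unchanged case entirely, which is one way to compare the two variants of the strategy side by side.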
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of an Unchanged Token in a BERT Input Sequence
A language model is pre-trained using a method where 15% of the words in an input sentence are selected for prediction. Of these selected words, a small fraction (10%) are intentionally left in their original form, while the model is still tasked with predicting them based on the surrounding context. What is the most significant reason for this strategy of leaving some target words unchanged?
Calculating Token Modifications in Pre-training
Critique of a Modified Pre-training Strategy