Calculating Token Modifications in Pre-training
During a language model's pre-training phase, 15% of tokens in each sequence are selected for a prediction task. Of these selected tokens, 10% are left in their original form. If a given input sequence contains 4,000 tokens, how many tokens would you expect to be selected for prediction but remain unchanged in the input? Provide only the final numerical answer.
60
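The expected count follows from chaining the two percentages: 4,000 × 0.15 × 0.10 = 60. A minimal sketch of that arithmetic (the function name and default rates are illustrative, mirroring the 15% selection and 10% keep-unchanged fractions stated in the question):

```python
def expected_unchanged_tokens(seq_len: int,
                              selected_frac: float = 0.15,
                              unchanged_frac: float = 0.10) -> int:
    """Expected number of tokens selected for prediction yet left unchanged."""
    return round(seq_len * selected_frac * unchanged_frac)

print(expected_unchanged_tokens(4000))  # 4000 * 0.15 * 0.10 = 60
```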
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of an Unchanged Token in a BERT Input Sequence
A language model is pre-trained using a method where 15% of the words in an input sentence are selected for prediction. Of these selected words, a small fraction (10%) are intentionally left in their original form, while the model is still tasked with predicting them based on the surrounding context. What is the most significant reason for this strategy of leaving some target words unchanged?
Calculating Token Modifications in Pre-training
Critique of a Modified Pre-training Strategy
Purpose of Unchanged Tokens in BERT's MLM Strategy