Calculating Contribution to MLM Training Objective
A language model is being trained on the original sentence 'The quick brown fox.' During one training step, the model receives the masked input 'The quick [MASK] fox.' and produces the following probability distribution for the masked position:
- P('brown' | 'The quick [MASK] fox') = 0.7
- P('red' | 'The quick [MASK] fox') = 0.2
- P('lazy' | 'The quick [MASK] fox') = 0.1
Based on the maximum likelihood estimation objective used in this type of training, calculate the specific log-probability value that this single masked token contributes to the total objective function. Explain the significance of this calculated value for the model's training process.
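The requested value can be computed directly. A minimal sketch, assuming the natural logarithm (the standard choice in cross-entropy implementations; the numeric answer changes if log base 2 or 10 is intended):

```python
import math

# Probability the model assigned to the correct token 'brown'
# at the masked position (from the question above).
p_correct = 0.7

# MLM objective contribution for this single masked token: log Pr(x_i | x̄).
contribution = math.log(p_correct)  # natural log

print(f"log P('brown') = {contribution:.4f}")  # ≈ -0.3567
```

The value is negative because the probability is below 1; training pushes it toward 0 by raising the probability assigned to the correct token.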
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
MLM Training Objective using Cross-Entropy Loss
In the context of training a language model, the objective is often to find parameters that maximize the likelihood of the training data. Consider the following mathematical expression for this objective:
Objective = ∑_{x ∈ D} ∑_{i ∈ A(x)} log Pr(xᵢ | x̄)

Here, D is the dataset, x is an original text sequence, x̄ is a version of x with some tokens masked, A(x) is the set of indices that were masked in x, and xᵢ is the original token at a masked position i. What does the inner summation, ∑_{i ∈ A(x)} log Pr(xᵢ | x̄), represent in this training process?

Calculating Contribution to MLM Training Objective
A language model is being trained with the objective of maximizing the log-probability of the original tokens at masked positions. For the original sentence 'The fox jumps over the dog', the model is given the masked input 'The fox [MASK] over the dog'. Which of the following model predictions for the [MASK] token would contribute the most to achieving the training objective for this specific instance?

Example of Masked Language Modeling Loss Calculation
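The inner summation from the objective above, ∑_{i ∈ A(x)} log Pr(xᵢ | x̄), can be sketched in a few lines. The probabilities below are illustrative placeholders, not outputs of a real model:

```python
import math

# Hypothetical probabilities the model assigns to the ORIGINAL token
# at each masked position i in A(x); keys are masked indices.
masked_predictions = {
    2: 0.7,  # P(original token at position 2 | masked input)
    5: 0.4,  # P(original token at position 5 | masked input)
}

# Inner summation: total log-probability over the masked positions of x.
inner_sum = sum(math.log(p) for p in masked_predictions.values())

print(f"∑ log Pr(x_i | x̄) = {inner_sum:.4f}")  # ≈ -1.2730
```

This per-sequence total is what the outer sum over the dataset D then accumulates; maximizing it is equivalent to minimizing the cross-entropy loss at the masked positions.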