Tags
Ch.1 Pre-training - Foundations of Large Language Models
Comprehension in Revised Bloom's Taxonomy
Related
MLM Training Objective using Cross-Entropy Loss
MLM Training Objective as Maximum Likelihood Estimation
Evaluating an MLM Training Implementation
A language model is being trained using a masked language modeling objective. The input is a sentence where some words have been replaced with a [MASK] token. While the high-level goal is to enable the model to reconstruct the original sentence from this corrupted input, the practical training objective is more specific. Which statement best analyzes the actual, simplified objective the model optimizes during training and the reason for this simplification?
During training with a masked language modeling objective, the model is optimized to predict only the masked tokens, not the entire original sequence: the cross-entropy loss is computed solely at the masked positions. The unmasked tokens are already visible in the corrupted input, so predicting them provides no useful learning signal, and restricting the loss to masked positions simplifies and speeds up training.
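This simplification can be sketched numerically. The following is a minimal pure-Python example, with made-up toy logits and a hypothetical `mlm_loss` helper, that computes cross-entropy only at masked positions; the `-100` ignore convention mirrors the one commonly used (e.g., by PyTorch's `CrossEntropyLoss`), but the numbers here are purely illustrative.

```python
import math

# Toy 3-word vocabulary and a 4-token sequence; position 2 was masked.
# logits[i] are made-up unnormalized scores over the vocab at position i.
logits = [
    [2.0, 0.1, 0.3],   # position 0 (unmasked)
    [0.2, 1.5, 0.1],   # position 1 (unmasked)
    [0.5, 0.4, 2.2],   # position 2 (was [MASK]; original token id = 2)
    [1.1, 0.2, 0.3],   # position 3 (unmasked)
]

# Labels hold the original token id at masked positions and IGNORE elsewhere.
IGNORE = -100          # ignore-index convention borrowed from PyTorch
labels = [IGNORE, IGNORE, 2, IGNORE]

def mlm_loss(logits, labels):
    """Mean cross-entropy over masked positions only."""
    total, count = 0.0, 0
    for scores, y in zip(logits, labels):
        if y == IGNORE:          # unmasked position: contributes nothing
            continue
        z = max(scores)          # numerically stable log-sum-exp
        log_norm = z + math.log(sum(math.exp(s - z) for s in scores))
        total += log_norm - scores[y]   # -log p(original token)
        count += 1
    return total / count

loss = mlm_loss(logits, labels)
print(f"loss over masked positions only: {loss:.4f}")
```

Note that changing the logits at any unmasked position leaves the loss unchanged, which is exactly the point: the objective is defined only over the corrupted positions.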