Consider a text-infilling model that is typically trained by masking about 15% of the words in a sentence and having the model predict them based on the surrounding unmasked words. If this training process is modified to mask 100% of the words in every input sentence, what is the most significant change in the fundamental skill the model is being trained to perform?
0
1
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider a text-infilling model that is typically trained by masking about 15% of the words in a sentence and having the model predict them based on the surrounding unmasked words. If this training process is modified to mask 100% of the words in every input sentence, what is the most significant change in the fundamental skill the model is being trained to perform?
Model Suitability for a Generation Task
Shift in Training Objective with 100% Masking