Token Masking as an Input Corruption Method
Token masking is an input corruption technique in which individual tokens in a sequence are selected at random and replaced with a special [MASK] token. The sequence length is preserved, and the model is trained to recover the original tokens. This is the same masking strategy employed in Masked Language Modeling (MLM).
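The procedure can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation; the function name `mask_tokens` and the masking probability are illustrative choices. Each token is independently replaced with "[MASK]" with probability `mask_prob`, so the corrupted sequence has the same length as the original.

```python
import random

# Minimal sketch of random token masking (illustrative, not a specific
# library's API). Each token is independently replaced with "[MASK]"
# with probability `mask_prob`; sequence length is preserved.
def mask_tokens(tokens, mask_prob=0.15, seed=None):
    rng = random.Random(seed)
    return ["[MASK]" if rng.random() < mask_prob else tok for tok in tokens]

tokens = "The quick brown fox jumps over the lazy dog .".split()
corrupted = mask_tokens(tokens, mask_prob=0.3, seed=0)
print(corrupted)  # same length as `tokens`, some entries replaced by [MASK]
```

A denoising objective then asks the model to predict the original tokens at the masked positions from the corrupted input.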
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Token Deletion as an Input Corruption Method
Combining Multiple Corruption Methods in Pre-training
Selecting Appropriate Input Corruption Methods
Token Alteration as an Input Corruption Method
Token Reordering as an Input Corruption Method
Input Corruption Methods for Multi-Sentence Sequences
A research team is pre-training an encoder-decoder model using a denoising objective. Their primary goal is to create a model that excels at summarizing long documents, which requires a deep understanding of the text's overall semantic content and logical flow, rather than its exact word-for-word structure. Which of the following input corruption strategies would be most aligned with this specific goal?
You are training an encoder-decoder model with a denoising objective. Match each input corruption method with the primary linguistic capability it is designed to teach the model.
Diagnosing Pre-training Deficiencies
Learn After
Example Comparison of Token Masking and Token Deletion
Span Masking
A common technique to create a 'noisy' version of a text sequence for model training involves randomly selecting individual words and replacing each one with a special marker, such as [MASK]. Given the original sentence: 'The quick brown fox jumps over the lazy dog.', which of the following options correctly demonstrates this specific technique?

Identifying an Input Alteration Procedure
A data scientist is preparing text for a model training process. The goal is to corrupt the input by replacing individual words with a special [MASK] marker, while keeping the total number of words (including the markers) the same as the original. Given the original sentence: 'The model must predict the original words from the altered input.', which of the following sentences correctly applies this specific technique?