Learn Before
Span Masking
Span masking is an input corruption technique in which non-overlapping spans of tokens are randomly sampled from a sequence, and each span is replaced by a single [MASK] token. This approach uniquely accommodates spans of length 0, where a [MASK] token is simply inserted at a chosen position in the sequence without removing any original tokens.
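To make the sampling and replacement steps concrete, here is a minimal Python sketch of span masking; the function name, parameter choices, and rejection-sampling loop are illustrative assumptions, not the course's reference implementation.

import random

MASK = "[MASK]"

def span_mask(tokens, n_spans=2, max_span_len=3, seed=0):
    """Replace randomly sampled, non-overlapping spans with a single
    [MASK] token each; a zero-length span inserts a [MASK] at the
    chosen position without removing any original tokens."""
    rng = random.Random(seed)
    spans = []  # accepted (start, length) pairs
    for _ in range(n_spans):
        start = rng.randint(0, len(tokens))  # start == len(tokens) permits insertion at the end
        length = rng.randint(0, min(max_span_len, len(tokens) - start))
        # Treat every span as occupying at least one position so that
        # accepted spans (including zero-length insertions) stay non-overlapping.
        if any(start < s + max(l, 1) and s < start + max(length, 1) for s, l in spans):
            continue  # overlaps an already accepted span; discard this draw
        spans.append((start, length))
    corrupted = list(tokens)
    # Apply replacements right to left so earlier span indices remain valid.
    for start, length in sorted(spans, reverse=True):
        corrupted[start:start + length] = [MASK]
    return corrupted

print(span_mask("The quick brown fox jumps over the lazy dog .".split(), seed=7))

Depending on the random draws, the output may replace a multi-token span (e.g., 'quick brown' becoming a single [MASK]) or insert a [MASK] between two original tokens; both are valid corruptions under this scheme.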
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example Comparison of Token Masking and Token Deletion
Span Masking
A common technique to create a 'noisy' version of a text sequence for model training involves randomly selecting individual words and replacing each one with a special marker, such as [MASK]. Given the original sentence: 'The quick brown fox jumps over the lazy dog.', which of the following options correctly demonstrates this specific technique?
Identifying an Input Alteration Procedure
A data scientist is preparing text for a model training process. The goal is to corrupt the input by replacing individual words with a special [MASK] marker, while keeping the total number of words (including the markers) the same as the original. Given the original sentence: 'The model must predict the original words from the altered input.', which of the following sentences correctly applies this specific technique?
Learn After
Example of Span Masking
Prediction Challenge in Span Masking
An engineer is analyzing a text corruption technique and observes the following input-output pair:
Original Text: 'The very happy dog played in the big yard.'
Processed Text: 'The [MASK] dog played [MASK] yard.'
Based on this single example, what can be definitively concluded about the rules of this corruption technique?
Reconstructing Masked Spans
Consider a text corruption technique where non-overlapping segments of text are randomly selected and each chosen segment is replaced by a single [MASK] token. According to this technique, if the three-word segment 'the big dog' is selected from a sentence, it would be replaced by '[MASK] [MASK] [MASK]'.
A text corruption technique involves selecting one or more non-overlapping segments of text (spans) and replacing each entire segment with a single [MASK] token. This technique also allows for selecting zero-length spans, which results in inserting a [MASK] token. Given the original sentence: 'The quick brown fox jumps over the lazy dog.', which of the following outputs represents an invalid application of this technique?