Learn Before
Prediction Challenge in Span Masking
A key challenge introduced by span masking is that the model must learn to predict the number of tokens that were originally part of a masked span. This is necessary because spans of varying lengths are all replaced by a single [MASK] token, so the model needs to determine the correct length of the text to generate.
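The effect described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `mask_spans`, the whitespace tokenization, and the half-open `(start, end)` span convention are all assumptions made for the example.

```python
MASK = "[MASK]"

def mask_spans(tokens, spans):
    """Replace each half-open (start, end) span with a single MASK token.

    Hypothetical helper for illustration; spans must not overlap.
    """
    out = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor or end < start:
            raise ValueError("spans must be well-formed and non-overlapping")
        out.extend(tokens[cursor:start])  # copy the untouched tokens
        out.append(MASK)                  # the whole span collapses to one mask
        cursor = end
    out.extend(tokens[cursor:])
    return out

tokens = "the cat sat on the mat".split()

# A one-token span and a three-token span both leave exactly one [MASK].
one_hidden = mask_spans(tokens, [(1, 2)])    # hides "cat"
three_hidden = mask_spans(tokens, [(1, 4)])  # hides "cat sat on"

print(" ".join(one_hidden))
print(" ".join(three_hidden))
```

In both corrupted sequences the hidden text is marked by a single `[MASK]`, so nothing in the input tells the model whether one token or three should be generated; the length has to be inferred from context.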
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Span Masking
Prediction Challenge in Span Masking
An engineer is analyzing a text corruption technique and observes the following input-output pair:
Original Text: 'The very happy dog played in the big yard.'
Processed Text: 'The [MASK] dog played [MASK] yard.'
Based on this single example, what can be definitively concluded about the rules of this corruption technique?
Reconstructing Masked Spans
Consider a text corruption technique where non-overlapping segments of text are randomly selected and each chosen segment is replaced by a single [MASK] token. According to this technique, if the three-word segment 'the big dog' is selected from a sentence, it would be replaced by '[MASK] [MASK] [MASK]'.
A text corruption technique involves selecting one or more non-overlapping segments of text (spans) and replacing each entire segment with a single [MASK] token. This technique also allows for selecting zero-length spans, which results in inserting a [MASK] token. Given the original sentence: 'The quick brown fox jumps over the lazy dog.', which of the following outputs represents an invalid application of this technique?
Learn After
Analogy between Span Length Prediction and Fertility Modeling
A language model is tasked with filling in a masked portion of text. The input provided is: 'The solar system consists of the Sun and the [MASK] that orbit it.' The original, unmasked text was: 'The solar system consists of the Sun and the celestial bodies that orbit it.' The model, however, generates the following output: 'The solar system consists of the Sun and the planets that orbit it.' Which statement provides the most accurate analysis of the model's primary error in this specific case?
The Variable-Length Prediction Problem
Diagnosing a Model's Infilling Errors