Learn Before
Example of Representing Masked Spans with Sentinel Tokens
To illustrate the use of sentinel tokens for masked spans, consider a training example where the consecutive words 'frolicking' and 'the house' are masked. The corrupted input sequence is represented with placeholder slots:
[CLS] The puppies are [X] outside [Y] .
The training task is to generate a target output sequence that fills these slots with the correct tokens, which can be expressed as:
<s> [X] frolicking [Y] the house [Z]
Here, the unique sentinel tokens [X] and [Y] replace the masked spans in the input, and the target sequence uses these same tokens to provide the missing text, concluding with a final sentinel [Z].
0
1
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Representing Masked Spans with Sentinel Tokens
A language model is given the input sentence: 'The new algorithm significantly improves performance on the benchmark.' In this instance, the consecutive words 'significantly improves' and 'on the benchmark' are both masked for a training task. How would this input be correctly represented if the modeling approach replaces each contiguous masked span with a single, unique placeholder token?
In a text processing approach where a single, unique placeholder replaces a span of one or more masked words, it is permissible to use the same placeholder (e.g.,
[X]) to represent two different, non-overlapping masked spans within the same input sentence.Consider the following input sequence where several consecutive tokens have been masked:
The model was trained [MASK] [MASK] a large dataset and then [MASK] [MASK] [MASK] on a specific task.Rewrite this sequence by replacing each contiguous span of masked tokens with a single, unique sentinel token, starting with[X]. Your answer should be the complete, rewritten sequence: ____
Learn After
Example of Target Output Format for Span-Based Denoising
Consider the following sentence where two spans of text have been selected for masking: 'The team celebrated after their [innovative new model] achieved [state-of-the-art results].' How would this sentence be represented if unique sentinel tokens are used to replace each complete masked span?
Match each original sentence, with text to be masked enclosed in brackets, to its correct representation using unique sentinel tokens.
Applying Sentinel Tokens to Masked Spans