Example of Representing Masked Spans with Sentinel Tokens

To illustrate the use of sentinel tokens for masked spans, consider a training example built from the sentence "The puppies are frolicking outside the house .", where the two spans 'frolicking' and 'the house' are masked. The corrupted input sequence marks each masked span with a placeholder slot: [CLS] The puppies are [X] outside [Y] . The training task is to generate a target sequence that fills these slots with the missing tokens: <s> [X] frolicking [Y] the house [Z] Here, the unique sentinel tokens [X] and [Y] replace the masked spans in the input, and the target sequence uses the same sentinels to delimit the recovered text, concluding with a final sentinel [Z] that marks the end of the output.
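The corruption procedure above can be sketched in a few lines of Python. This is a minimal illustration, not any library's actual implementation: the function name `corrupt_spans`, the span format, and the sentinel strings are assumptions for this sketch, and the [CLS] / <s> special tokens from the example are omitted for simplicity.

```python
# Hypothetical sketch of T5-style span corruption with sentinel tokens.
SENTINELS = ["[X]", "[Y]", "[Z]"]

def corrupt_spans(tokens, spans):
    """Replace each masked span with a unique sentinel token.

    tokens: list of word tokens.
    spans:  list of (start, end) index pairs, sorted and non-overlapping.
    Returns (corrupted_input, target) as lists of tokens.
    """
    corrupted, target = [], []
    i = 0
    for k, (start, end) in enumerate(spans):
        corrupted.extend(tokens[i:start])     # keep unmasked text
        corrupted.append(SENTINELS[k])        # slot marker in the input
        target.append(SENTINELS[k])           # same marker in the target
        target.extend(tokens[start:end])      # the masked-out tokens
        i = end
    corrupted.extend(tokens[i:])
    target.append(SENTINELS[len(spans)])      # final sentinel ends the target
    return corrupted, target

tokens = "The puppies are frolicking outside the house .".split()
spans = [(3, 4), (5, 7)]  # 'frolicking' and 'the house'
inp, tgt = corrupt_spans(tokens, spans)
print(" ".join(inp))  # The puppies are [X] outside [Y] .
print(" ".join(tgt))  # [X] frolicking [Y] the house [Z]
```

The model is trained to map the corrupted input to the target, so it only has to generate the short masked spans rather than reproduce the whole sentence.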

Updated 2026-04-16

Ch.1 Pre-training - Foundations of Large Language Models
