Example of Token Masking in a BERT Input Sequence
An example of token masking can be seen with a two-sentence input prepared for BERT. The original sequence, [CLS] It is raining . [SEP] I need an umbrella . [SEP], is modified by replacing selected tokens with the [MASK] symbol. This results in the masked sequence: [CLS] It is [MASK] . [SEP] I need [MASK] umbrella . [SEP], where the model's task is to predict the original words 'raining' and 'an'.
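As a concrete illustration, here is a minimal Python sketch of this masking step; the token list and the selected positions are assumptions chosen to match the example above, not any particular library's API.

```python
# Minimal sketch: replace tokens selected for prediction with [MASK].
tokens = ["[CLS]", "It", "is", "raining", ".", "[SEP]",
          "I", "need", "an", "umbrella", ".", "[SEP]"]

# Positions chosen for prediction: 'raining' (index 3) and 'an' (index 8).
selected_positions = {3, 8}

masked = ["[MASK]" if i in selected_positions else tok
          for i, tok in enumerate(tokens)]
targets = {i: tokens[i] for i in sorted(selected_positions)}  # prediction targets

print(" ".join(masked))
# [CLS] It is [MASK] . [SEP] I need [MASK] umbrella . [SEP]
print(targets)  # {3: 'raining', 8: 'an'}
```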
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Token Masking in a BERT Input Sequence
During a language model's pre-training, a specific strategy is used to alter words that have been chosen for the model to predict. If 10,000 words in a dataset have been chosen for this prediction task, and the strategy dictates that 80% of these chosen words are replaced with a special placeholder symbol, approximately how many of the 10,000 chosen words will be replaced by this symbol?
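For reference, the expected count follows directly from the numbers stated in the question (10,000 selected words, 80% replacement); the sketch below just restates that arithmetic.

```python
# 80% of the selected words receive the placeholder symbol.
num_selected = 10_000
replace_fraction = 0.80
print(int(num_selected * replace_fraction))  # 8000
```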
Verifying a Language Model's Pre-training Data
Consider a standard pre-training procedure for a language model where 15% of all tokens in an input are first selected for prediction. Of these selected tokens, 80% are then replaced with a special [MASK] symbol. Based on this procedure, it is guaranteed that for any given input sequence of 1,000 tokens, exactly 120 tokens will be replaced with the [MASK] symbol.
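How this plays out depends on implementation details the question leaves open; the sketch below assumes each token is selected and then masked independently at random (one common implementation, not the only one), in which case the count fluctuates around the expected value of 0.15 × 0.80 × 1,000 = 120 rather than landing on it exactly.

```python
import random

def count_masked(seq_len=1_000, select_p=0.15, mask_p=0.80):
    # Independently select each token, then independently mask 80% of those.
    return sum(1 for _ in range(seq_len)
               if random.random() < select_p and random.random() < mask_p)

print([count_masked() for _ in range(5)])  # counts scatter around 120
```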
Example of an Unchanged Token in a BERT Input Sequence
Example of Random Token Replacement in a BERT Input Sequence
A language model is designed to process pairs of sentences by concatenating them into a single sequence. This model requires a special token at the beginning of the entire sequence to be used for classification tasks, and another special token to mark the boundary between the two sentences and to signify the end of the sequence. Given the two sentences 'The sky is blue.' and 'The grass is green.', which of the following options correctly formats them as a single input sequence for this model?
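A small helper makes the expected format explicit; the function name and the whitespace-delimited, pre-tokenized sentences are simplifications for illustration.

```python
def format_pair(sentence_a, sentence_b):
    # [CLS] opens the sequence; [SEP] closes each sentence, including the last.
    return f"[CLS] {sentence_a} [SEP] {sentence_b} [SEP]"

print(format_pair("The sky is blue .", "The grass is green ."))
# [CLS] The sky is blue . [SEP] The grass is green . [SEP]
```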
Debugging Model Input Formatting
Analyzing Input Sequence Structure
Learn After
A language model is being prepared to predict missing words in a two-sentence input. The original input is: [CLS] The sun is shining brightly . [SEP] Birds are singing loudly . [SEP]. If the words 'shining' and 'singing' are selected for masking, which of the following represents the correctly modified input sequence?
Reconstructing an Original Input Sequence
A researcher is preparing the input sequence [CLS] The cat sat on the mat . [SEP] for a masked language modeling task. To train the model to predict the word 'sat', the researcher replaces 'sat' with the special token ____.
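To see the prediction side of this setup, a pre-trained BERT can be queried through the Hugging Face transformers fill-mask pipeline. This is an illustration, not part of the exercise; it assumes the transformers package is installed and downloads a model on first use.

```python
from transformers import pipeline

# Query a pre-trained BERT for the masked position.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for candidate in fill_mask("The cat [MASK] on the mat."):
    print(candidate["token_str"], round(candidate["score"], 3))
# A well-trained model should rank 'sat' among the top candidates.
```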