Learn Before
Predicting from Corrupted Input
During a language model's pre-training, the input sentence 'The quick brown fox jumps over the lazy dog' is modified. The token 'jumps' is selected for prediction and, as part of the training strategy, is replaced by a random token, 'sings'. The model is then fed the corrupted input: 'The quick brown fox sings over the lazy dog'. What is the model's specific objective for the token at the position of 'sings'?
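The corruption described in this question can be sketched as a BERT-style masking routine. This is a minimal illustration, not any library's actual implementation; the function name is hypothetical, and the 80/10/10 split among `[MASK]`, random replacement, and keep-unchanged follows the recipe from the BERT paper:

```python
import random

def corrupt_for_mlm(tokens, vocab, mask_prob=0.15, seed=0):
    """BERT-style corruption sketch: of the tokens selected for prediction,
    ~80% become [MASK], ~10% become a random vocabulary token, and ~10%
    are left unchanged. The target is always the ORIGINAL token, so at a
    randomly replaced position (e.g. 'sings') the model must still
    predict the original word (e.g. 'jumps')."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    targets = [None] * len(tokens)        # None = not predicted at this position
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:      # select this token for prediction
            targets[i] = tok              # objective: recover the original token
            roll = rng.random()
            if roll < 0.8:
                corrupted[i] = "[MASK]"
            elif roll < 0.9:
                corrupted[i] = rng.choice(vocab)  # random-replacement branch
            # else: token is left unchanged but still predicted
    return corrupted, targets
```

Whatever branch is taken, the training signal at a selected position is the original token, which is why the model's objective at the position of 'sings' is to predict 'jumps'.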
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Random Token Replacement in a BERT Input Sequence
In a language model's pre-training, a portion of the input tokens selected for prediction are substituted with a completely random token from the vocabulary, rather than always being replaced with a special placeholder like [MASK]. What is the primary analytical justification for this specific strategy?
Predicting from Corrupted Input
A language model's pre-training process involves selecting a subset of tokens in an input sequence for prediction. One modification technique applied to these selected tokens is to substitute them with a completely random token from the model's vocabulary. Given the original sequence:
The cat sat on the mat.
If the token 'sat' is chosen for this specific random replacement technique, which of the following is a valid resulting sequence?
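The random-replacement branch asked about above can be sketched in isolation. This is an illustrative helper (the function name and toy vocabulary are assumptions, not from any library); the key point it demonstrates is that the corrupted input changes while the prediction target stays the original token:

```python
import random

def random_replace(tokens, index, vocab, seed=0):
    """The 'random token' branch of BERT-style corruption: the selected
    token is swapped for a random, different vocabulary token, while the
    training target remains the ORIGINAL token at that position."""
    rng = random.Random(seed)
    candidates = [w for w in vocab if w != tokens[index]]
    corrupted = list(tokens)
    corrupted[index] = rng.choice(candidates)
    return corrupted, tokens[index]       # corrupted sequence, original target

tokens = ["The", "cat", "sat", "on", "the", "mat", "."]
vocab = ["dog", "ran", "sings", "blue"]   # toy vocabulary for illustration
corrupted, target = random_replace(tokens, 2, vocab, seed=42)
# 'sat' is replaced by some other vocabulary word; the target is still 'sat'
```

A valid resulting sequence therefore differs from the original only at the chosen position, e.g. "The cat ran on the mat." with 'sat' as the token to be predicted.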