Identifying an Input Alteration Procedure
An engineer is preparing data to train a language model. They start with the input sentence: 'The model learns to predict the original text from a corrupted version.' The model is then fed the following altered version: 'The model learns to [MASK] the original text from a [MASK] version.' Based on this transformation, describe the specific procedure that was used to alter the original sentence. Explain the key characteristics of this procedure.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example Comparison of Token Masking and Token Deletion
Span Masking
A common technique to create a 'noisy' version of a text sequence for model training involves randomly selecting individual words and replacing each one with a special marker, such as [MASK]. Given the original sentence: 'The quick brown fox jumps over the lazy dog.', which of the following options correctly demonstrates this specific technique?
Identifying an Input Alteration Procedure
A data scientist is preparing text for a model training process. The goal is to corrupt the input by replacing individual words with a special [MASK] marker, while keeping the total number of words (including the markers) the same as the original. Given the original sentence: 'The model must predict the original words from the altered input.', which of the following sentences correctly applies this specific technique?
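The word-for-marker replacement described in the items above can be illustrated with a minimal Python sketch. It is not taken from the course material: the function name mask_tokens, the mask_prob and seed parameters, and the use of whitespace splitting in place of a real tokenizer are all assumptions made for illustration.

```python
import random

MASK = "[MASK]"  # special marker that stands in for a hidden word


def mask_tokens(text, mask_prob=0.15, seed=None):
    """Replace each whitespace-separated word with MASK with probability mask_prob.

    Hypothetical helper for illustration only. Splitting on whitespace is an
    assumption; real pipelines typically operate on tokenizer output.
    """
    rng = random.Random(seed)
    tokens = text.split()
    # One-for-one replacement: every position either keeps its word or
    # becomes MASK, so the sequence length never changes.
    corrupted = [MASK if rng.random() < mask_prob else tok for tok in tokens]
    return " ".join(corrupted)


if __name__ == "__main__":
    sentence = "The model must predict the original words from the altered input."
    print(mask_tokens(sentence, mask_prob=0.3, seed=0))
```

Because each selected word is swapped for exactly one marker, the corrupted sequence has the same number of positions as the original. This is the property that separates this procedure from deleting words, which shortens the sequence.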