Learn Before
Deconstructing a Masked Sentence
A text corruption technique selects non-overlapping segments of text (spans) and replaces each span with a single [MASK] token. Given the original sentence 'The new AI assistant can write code and summarize articles.' and its corrupted version 'The [MASK] assistant can [MASK] summarize articles.', identify the two text spans that were replaced by the [MASK] tokens.
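One way to approach this kind of question is to align the corrupted sentence against the original, token by token. The sketch below is illustrative only: the function name `recover_spans` and the example sentence are hypothetical, and it assumes whitespace tokenization, no two adjacent [MASK] tokens, and that the token following a mask does not also occur inside the masked span.

```python
def recover_spans(original, corrupted, mask="[MASK]"):
    """Return the text hidden behind each mask, in order (hypothetical helper)."""
    orig = original.split()
    corr = corrupted.split()
    spans = []
    i = 0  # current position in the original token list
    for j, tok in enumerate(corr):
        if tok != mask:
            # An unmasked token must line up with the original.
            if i >= len(orig) or orig[i] != tok:
                raise ValueError(f"cannot align token {tok!r}")
            i += 1
            continue
        # The masked span runs until the next unmasked token reappears.
        nxt = corr[j + 1] if j + 1 < len(corr) else None
        start = i
        if nxt is None:
            i = len(orig)  # the mask is last, so it covers the rest
        else:
            while i < len(orig) and orig[i] != nxt:
                i += 1
        spans.append(" ".join(orig[start:i]))
    if i != len(orig):
        raise ValueError("corrupted sentence does not cover the original")
    return spans


# Hypothetical example sentence, not taken from the exercise.
print(recover_spans("She reads old books every quiet evening.",
                    "She [MASK] books every [MASK]"))
# ['reads old', 'quiet evening.']
```

A zero-length span shows up as an empty string in the result, because the token after the mask matches the original immediately.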
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A text corruption technique involves selecting non-overlapping segments of text and replacing each segment with a single [MASK] token. This technique also allows for selecting zero-length segments, which results in the insertion of a [MASK] token at that position. Given the original sentence 'The quick brown fox jumps over the lazy dog.' and two segments selected for corruption (the segment 'brown fox' and a zero-length segment between 'the' and 'lazy'), what is the resulting corrupted sentence?
Deconstructing a Masked Sentence
Analyzing a Text Corruption Process
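The corruption process described in the related items can be sketched as a short function. This is a minimal illustration, not a reference implementation: the name `corrupt`, the half-open `(start, end)` token-index convention, and the example sentence are all assumptions made for the sketch. A span with `start == end` is treated as zero-length and simply inserts a [MASK] token at that position.

```python
def corrupt(tokens, spans, mask="[MASK]"):
    """Replace each (start, end) token span with a single mask token.

    Hypothetical helper: spans are half-open index pairs and must not overlap.
    """
    out = []
    pos = 0  # first original token not yet copied or masked
    for start, end in sorted(spans):
        if start < pos or end < start or end > len(tokens):
            raise ValueError("spans must be in range and non-overlapping")
        out.extend(tokens[pos:start])  # copy the untouched tokens
        out.append(mask)               # one mask per span, whatever its length
        pos = end
    out.extend(tokens[pos:])
    return out


# Hypothetical example sentence, not taken from the exercises.
tokens = "She reads old books every quiet evening.".split()
print(" ".join(corrupt(tokens, [(1, 3), (5, 5)])))
# She [MASK] books every [MASK] quiet evening.
```

Here `(1, 3)` masks the two-token span 'reads old', while `(5, 5)` is a zero-length span that inserts a mask between 'every' and 'quiet' without removing any text.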