Learn Before
Analyzing Text Corruption Strategies
Consider the following three methods for altering an input text sequence during the pre-training of a language model:
- Randomly replacing 15% of the words in the sequence with a special placeholder symbol.
- Randomly changing the order of sentences within the sequence.
- Deleting 15% of the words at random positions throughout the sequence.
Analyze these methods and identify which one is uniquely applicable to texts composed of multiple sentences. Justify your choice: explain why a multi-sentence structure is essential for that method, and why the other two methods do not share this requirement. (A minimal code sketch of all three methods appears after this prompt.)
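The following is a minimal Python sketch of the three corruption methods, under simplifying assumptions: text is split on whitespace rather than subword-tokenized, sentences are split naively on periods, and the `[MASK]` placeholder and function names are illustrative choices, not taken from any particular library.

```python
import random

# Hypothetical placeholder symbol; real models use tokenizer-specific tokens
# (e.g., BART uses <mask>). This exact string is an assumption for illustration.
MASK = "[MASK]"

def mask_words(words, rate=0.15, rng=random):
    """Replace ~`rate` of the words with a placeholder (word-level masking).
    Works on any flat sequence of words; no sentence structure is needed."""
    out = list(words)
    k = max(1, int(len(out) * rate))
    for i in rng.sample(range(len(out)), k):
        out[i] = MASK
    return out

def permute_sentences(sentences, rng=random):
    """Shuffle the order of sentences (sentence permutation).
    Only meaningful when the input has two or more sentences;
    with a single sentence there is nothing to reorder."""
    out = list(sentences)
    rng.shuffle(out)
    return out

def delete_words(words, rate=0.15, rng=random):
    """Delete ~`rate` of the words at random positions (word-level deletion).
    Like masking, this needs only a flat sequence of words."""
    k = max(1, int(len(words) * rate))
    drop = set(rng.sample(range(len(words)), k))
    return [w for i, w in enumerate(words) if i not in drop]

# Usage on a toy three-sentence input:
text = "The cat sat on the mat. The dog barked loudly. Then it rained."
words = text.split()
sentences = [s.strip() + "." for s in text.split(".") if s.strip()]

print(" ".join(mask_words(words)))
print(" ".join(permute_sentences(sentences)))
print(" ".join(delete_words(words)))
```

Note that `permute_sentences` degenerates to the identity on a one-sentence input, which is exactly why that method presupposes a multi-sentence text, while the two word-level methods apply to any non-empty word sequence.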
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
BART Model's Corruption Methods for Multi-Sentence Sequences
When pre-training a model on a document, a common strategy is to intentionally alter the input text and task the model with restoring the original. Which of the following alteration techniques is uniquely dependent on the input text containing more than one sentence?
When preparing text data to train a language model, various "corruption" techniques are used to alter the original input, which the model then learns to restore. Some of these techniques operate on the word or token level, while others operate on the sentence level. Match each corruption technique described below with the structural requirement it places on the input text.
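As a concrete aid for that matching exercise, here is a small illustrative mapping in code; the labels are informal descriptions drawn from the prompts above, not identifiers from any real API:

```python
# Informal mapping of each corruption technique to the structure it requires.
# Labels are descriptive only; they are not names from any library.
REQUIREMENTS = {
    "replace 15% of words with a placeholder": "any word sequence",
    "shuffle the order of sentences":          "two or more sentences",
    "delete 15% of words at random positions": "any word sequence",
}

for technique, needed in REQUIREMENTS.items():
    print(f"{technique:42s} -> requires: {needed}")
```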