Input Corruption Methods for Multi-Sentence Sequences
For input sequences that contain multiple sentences, corruption can go beyond standard token-level modifications. BART, for instance, adds two methods designed specifically for multi-sentence text: sentence permutation, which splits the document into sentences and shuffles them into a random order, and document rotation, which picks a token uniformly at random and rotates the document so that it begins with that token. Both corruptions force the model to reason about how sentences and spans fit together across the whole document, rather than only recovering individual tokens.
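These two corruptions can be sketched in a few lines of Python. The sketch below is illustrative rather than BART's actual preprocessing: the function names are mine, sentences are split naively on full stops, and tokenization is plain whitespace, where a real pipeline would use a proper sentence segmenter and a subword tokenizer.

```python
import random

def permute_sentences(document: str) -> str:
    """Sentence permutation: split the document into sentences
    and shuffle them into a random order."""
    # Naive split on full stops; a real pipeline would use a
    # dedicated sentence segmenter.
    sentences = [s.strip() for s in document.split('.') if s.strip()]
    random.shuffle(sentences)
    return '. '.join(sentences) + '.'

def rotate_document(tokens: list[str]) -> list[str]:
    """Document rotation: choose a token uniformly at random and
    rotate the sequence so the document begins with that token."""
    start = random.randrange(len(tokens))
    return tokens[start:] + tokens[:start]

if __name__ == "__main__":
    random.seed(0)  # reproducible demo
    doc = "The cat sat on the mat. It was warm. The dog watched from the door."
    print(permute_sentences(doc))
    print(" ".join(rotate_document(doc.split())))
```

During pre-training, the model receives the corrupted sequence as encoder input and is trained to regenerate the original document with the decoder, so the training signal comes from undoing these transformations.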
Practice Questions

Selecting a Training Method for a Summarization Model
A research team is pre-training an encoder-decoder model using a denoising objective. Their primary goal is to create a model that excels at summarizing long documents, which requires a deep understanding of the text's overall semantic content and logical flow rather than its exact word-for-word structure. Which input corruption strategy would be most aligned with this goal?

Diagnosing Pre-training Deficiencies
You are training an encoder-decoder model with a denoising objective. Match each input corruption method with the primary linguistic capability it is designed to teach the model.

Comparing Multi-Sentence Corruption Techniques
When training a model on a document with multiple sentences, what is the primary advantage of corrupting the input by randomly shuffling the order of entire sentences, as opposed to simply reordering individual tokens across the entire document? The sketch below illustrates the contrast.
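To make that last contrast concrete, here is a minimal sketch that applies both corruptions to the same text. The three-sentence document and whitespace tokenization are invented for the demo:

```python
import random

random.seed(1)  # reproducible demo

# A tiny document whose sentence order carries discourse cues
# ("Then", "Finally") that signal the original ordering.
sentences = ["She boiled the water.",
             "Then she added the pasta.",
             "Finally she drained it."]
tokens = " ".join(sentences).split()

# Token reordering: shuffles every token, destroying word order even
# inside sentences, so local grammar must be rebuilt from scratch.
token_shuffled = tokens[:]
random.shuffle(token_shuffled)

# Sentence permutation: each sentence stays intact; only the
# ordering between sentences is destroyed.
sentence_shuffled = sentences[:]
random.shuffle(sentence_shuffled)

print("token-level   :", " ".join(token_shuffled))
print("sentence-level:", " ".join(sentence_shuffled))
```

In the token-level output even individual sentences are ungrammatical, so the model spends capacity relearning local syntax. In the sentence-level output each sentence remains fluent, and restoring the original order requires reasoning about inter-sentence coherence, which is exactly the document-level capability the question asks about.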