Selecting Appropriate Input Corruption Methods
The choice of input corruption methods strongly influences how effective the pre-training of an encoder-decoder model is. Because these methods define the training objective, selecting the most suitable ones is a critical step that generally requires careful empirical evaluation and experimentation.
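As a rough illustration of what is being chosen between, the sketch below implements three simple corruption methods (token masking, token deletion, and token reordering) and pairs each corrupted input with the original sequence as the denoising target. The function names, the `[MASK]` symbol, and the default rate of 0.15 are assumptions made for this example; they are not taken from the course material or from any particular library.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder symbol, chosen for this sketch


def mask_tokens(tokens, rate, rng):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else t for t in tokens]


def delete_tokens(tokens, rate, rng):
    """Drop each token independently with probability `rate`."""
    return [t for t in tokens if rng.random() >= rate]


def reorder_tokens(tokens, rng):
    """Return a shuffled copy of the tokens; the input list is not modified."""
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return shuffled


# Candidate corruption methods behind a common (tokens, rate, rng) interface,
# so they can be swapped in and out during an empirical comparison.
CORRUPTIONS = {
    "mask": mask_tokens,
    "delete": delete_tokens,
    "reorder": lambda tokens, rate, rng: reorder_tokens(tokens, rng),
}


def make_example(tokens, method, rng, rate=0.15):
    """Build one (corrupted input, original target) denoising pair."""
    corrupted = CORRUPTIONS[method](tokens, rate, rng)
    return corrupted, list(tokens)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    sentence = "the cat sat on the mat".split()
    for name in CORRUPTIONS:
        source, target = make_example(sentence, name, rng, rate=0.3)
        print(name, source, "->", target)
```

In an actual selection experiment, each candidate method (or mixture of methods) would be used to pre-train a model, and the resulting models would be compared on the downstream task of interest; the sketch covers only the data-corruption step.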
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Token Masking as an Input Corruption Method
Token Deletion as an Input Corruption Method
Combining Multiple Corruption Methods in Pre-training
Token Alteration as an Input Corruption Method
Token Reordering as an Input Corruption Method
Input Corruption Methods for Multi-Sentence Sequences
Corruption Methods for Multi-Sentence Sequences
A research team is pre-training an encoder-decoder model using a denoising objective. Their primary goal is to create a model that excels at summarizing long documents, which requires a deep understanding of the text's overall semantic content and logical flow, rather than its exact word-for-word structure. Which of the following input corruption strategies would be most aligned with this specific goal?
You are training an encoder-decoder model with a denoising objective. Match each input corruption method with the primary linguistic capability it is designed to teach the model.
Diagnosing Pre-training Deficiencies
Learn After
Selecting a Pre-training Strategy for a Summarization Model
A research team is pre-training an encoder-decoder model specifically for the task of correcting complex grammatical errors and improving sentence structure in user-generated text. The team wants to select a pre-training objective that will best prepare the model for this downstream task. Which of the following input corruption strategies is most likely to be effective, and why?
Designing an Experiment to Select a Pre-training Objective