Input Corruption Methods for Denoising Autoencoder Training
When training encoder-decoder models with a denoising autoencoding objective, the input data can be corrupted in several ways. Corruption is essential to this objective: the model is trained to reconstruct the original input from its corrupted version. Besides the common technique of masking tokens, corruption strategies include replacing tokens with different ones and reordering tokens within the sequence.
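The three strategies named above (masking, alteration, reordering) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than a prescribed implementation: the function names, the `[MASK]` placeholder string, whitespace tokenization, the corruption rate, and the toy vocabulary are all hypothetical choices.

```python
import random

MASK = "[MASK]"  # assumed placeholder symbol, chosen for illustration


def mask_tokens(tokens, rate, rng):
    """Replace each token with MASK independently with probability `rate`."""
    return [MASK if rng.random() < rate else tok for tok in tokens]


def alter_tokens(tokens, rate, vocab, rng):
    """With probability `rate`, swap a token for one drawn at random from `vocab`.

    The random draw may happen to return the original token; this sketch
    does not guard against that.
    """
    return [rng.choice(vocab) if rng.random() < rate else tok for tok in tokens]


def reorder_tokens(tokens, rng):
    """Return a shuffled copy of the tokens; the input list is left untouched."""
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return shuffled


def make_example(tokens, corrupt):
    """Pair a corrupted encoder input with the uncorrupted decoder target."""
    return {"encoder_input": corrupt(tokens), "decoder_target": list(tokens)}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    sentence = "The scientist carefully poured the solution into the beaker ."
    tokens = sentence.split()  # whitespace tokenization, assumed for simplicity
    toy_vocab = ["cat", "quickly", "blue", "ran"]  # hypothetical vocabulary

    for name, corrupt in [
        ("mask", lambda t: mask_tokens(t, 0.3, rng)),
        ("alter", lambda t: alter_tokens(t, 0.3, toy_vocab, rng)),
        ("reorder", lambda t: reorder_tokens(t, rng)),
    ]:
        example = make_example(tokens, corrupt)
        print(name, "->", " ".join(example["encoder_input"]))
```

In each case the decoder target is the original token sequence, so only the encoder input changes between corruption methods. A real pipeline would typically operate on subword token IDs rather than whitespace-split words.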
References
Reference of Foundations of Large Language Models Course
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Denoising Task with Consecutive Token Masking
Span-Based Denoising as an Encoder-Decoder Training Objective
Input Corruption Methods for Denoising Autoencoder Training
Denoising Autoencoder Training Objective
Loss Calculation for Encoder-Decoder Denoising Tasks
Training Efficiency in Denoising Autoencoding
Flexibility of Masked Language Modeling for Encoder-Decoder Training
Example of a Denoising Autoencoder Task for Encoder-Decoder Models
BART Model's Use of Diverse Input Corruption Methods
An encoder-decoder model is being trained using the following example:
- Input to Encoder: "The scientist carefully [MASK] the solution into the beaker."
- Target Output for Decoder: "The scientist carefully poured the solution into the beaker."
Based on this training setup, what is the primary function of the decoder?
Evaluating a Model Training Objective
An encoder-decoder model is being trained with the objective of reconstructing a full, original sentence from an input version where several random words have been removed. What is the most critical function of the encoder's output in this specific training paradigm?
Corrupted Input for Encoder-Decoder Pre-training
Diagrammatic Example of an Encoder-Decoder Model Trained with a Denoising Autoencoding Objective
Learn After
Token Masking as an Input Corruption Method
Token Deletion as an Input Corruption Method
Combining Multiple Corruption Methods in Pre-training
Selecting Appropriate Input Corruption Methods
Token Alteration as an Input Corruption Method
Token Reordering as an Input Corruption Method
Input Corruption Methods for Multi-Sentence Sequences
Corruption Methods for Multi-Sentence Sequences
A research team is pre-training an encoder-decoder model using a denoising objective. Their primary goal is to create a model that excels at summarizing long documents, which requires a deep understanding of the text's overall semantic content and logical flow, rather than its exact word-for-word structure. Which of the following input corruption strategies would be most aligned with this specific goal?
You are training an encoder-decoder model with a denoising objective. Match each input corruption method with the primary linguistic capability it is designed to teach the model.
Diagnosing Pre-training Deficiencies