Diagrammatic Example of an Encoder-Decoder Model Trained with a Denoising Autoencoding Objective
This diagram provides an example of training an encoder-decoder model with a denoising autoencoding objective. The process involves several key steps:
1. Corrupted Input to Encoder: The encoder receives a corrupted version of a sentence in which some tokens are masked, for instance "[CLS] The puppies are [MASK] outside [MASK] house .".
2. Sequence Reconstruction by Decoder: The encoder produces a hidden-state representation of the input, which is passed to the decoder. The decoder's task is to reconstruct the original, uncorrupted sentence, "⟨s⟩ The puppies are frolicking outside the house .", in an autoregressive manner.
3. Sequence-Level Loss Calculation: To train the model, a loss is computed over the entire output sequence by accumulating the losses of all tokens, as in standard language modeling. The decoder's generated output is compared with the ground-truth sequence, and the resulting error signal is used to update the model's parameters.
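The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a real model: `corrupt` builds the masked encoder input, and `sequence_loss` accumulates per-token negative log-likelihoods into a sequence-level loss. The `token_probs` values stand in for the probabilities a trained decoder would assign to each ground-truth token and are hypothetical.

```python
import math
import random

def corrupt(tokens, mask_rate=0.3, mask_token="[MASK]", seed=0):
    """Randomly replace tokens with [MASK] to build the corrupted encoder input."""
    rng = random.Random(seed)
    return [mask_token if rng.random() < mask_rate else t for t in tokens]

def sequence_loss(token_probs):
    """Accumulate per-token negative log-likelihoods into one sequence-level
    loss, as in standard language modeling."""
    return sum(-math.log(p) for p in token_probs)

original = "The puppies are frolicking outside the house .".split()
corrupted = corrupt(original)

# The decoder is trained to reproduce `original` autoregressively; here,
# token_probs are hypothetical probabilities the decoder assigns to each
# ground-truth token under teacher forcing.
token_probs = [0.9] * len(original)
loss = sequence_loss(token_probs)
```

In a real system the probabilities come from a softmax over the decoder's vocabulary at each step, and the accumulated loss is backpropagated through both decoder and encoder.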
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Denoising Task with Consecutive Token Masking
Span-Based Denoising as an Encoder-Decoder Training Objective
Input Corruption Methods for Denoising Autoencoder Training
Denoising Autoencoder Training Objective
Loss Calculation for Encoder-Decoder Denoising Tasks
Training Efficiency in Denoising Autoencoding
Flexibility of Masked Language Modeling for Encoder-Decoder Training
Example of a Denoising Autoencoder Task for Encoder-Decoder Models
BART Model's Use of Diverse Input Corruption Methods
An encoder-decoder model is being trained using the following example:
- Input to Encoder: "The scientist carefully [MASK] the solution into the beaker."
- Target Output for Decoder: "The scientist carefully poured the solution into the beaker."
Based on this training setup, what is the primary function of the decoder?
Evaluating a Model Training Objective
An encoder-decoder model is being trained with the objective of reconstructing a full, original sentence from an input version where several random words have been removed. What is the most critical function of the encoder's output in this specific training paradigm?
Corrupted Input for Encoder-Decoder Pre-training
Learn After
An encoder-decoder model is trained on a denoising task. It receives a corrupted input like
"The quick [M] fox jumps [M] the lazy dog." and must generate the original, complete sentence "The quick brown fox jumps over the lazy dog." The decoder generates the output one word at a time. Why is the training loss typically calculated for each word the decoder generates, rather than just a single loss for the entire completed sentence?
Analyzing a Denoising Training Process
A researcher is training an encoder-decoder model using a denoising objective, where the model learns to reconstruct an original sentence from a corrupted version. Arrange the following steps of a single training iteration in the correct chronological order.
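The per-word loss asked about above can be made concrete with a small sketch. Under teacher forcing, each decoding step contributes its own negative log-probability term, so the training signal is dense and pinpoints the steps where the model is weak. The `step_probs` values are hypothetical probabilities assigned to each ground-truth word of the example sentence.

```python
import math

def per_token_losses(step_probs):
    """One loss term per decoding step: the negative log-probability the
    decoder assigned to the correct next word (teacher forcing)."""
    return [-math.log(p) for p in step_probs]

# Hypothetical per-step probabilities for the ground-truth words of
# "The quick brown fox jumps over the lazy dog ."
step_probs = [0.95, 0.9, 0.4, 0.85, 0.9, 0.6, 0.95, 0.8, 0.9]

losses = per_token_losses(step_probs)
total = sum(losses)  # the sequence-level loss is the sum of per-step terms

# Per-step terms show exactly which words the model struggles with
# (here the low-probability "brown" step), whereas a single
# sentence-level number would hide this information.
hardest_step = max(range(len(losses)), key=lambda i: losses[i])
```

Because the total is just the sum of the per-step terms, gradients flow to every decoding step, not only to a final sentence-level comparison.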