Learn Before
Example of Sentinel Masking in Denoising Autoencoding
Sentinel masking is a denoising autoencoding task in which corrupted spans of the input are replaced by special sentinel tokens, such as [X] and [Y]. For instance, an input sequence might be corrupted to [C] The kitten [X] the [Y]. The encoder-decoder model is then trained to predict only the masked portions, outputting a sequence that reconstructs the missing tokens associated with each sentinel, such as [X] is chasing [Y] ball.
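The corruption step above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the `sentinel_mask` helper and the hard-coded span positions are assumptions made for this example, and real pre-training pipelines sample span locations and lengths randomly over token IDs.

```python
# Sentinel tokens used to stand in for corrupted spans (hypothetical set).
SENTINELS = ["[X]", "[Y]", "[Z]"]

def sentinel_mask(tokens, spans):
    """Replace each (start, end) token span with a sentinel and build
    the target sequence the decoder is trained to predict.

    tokens: list of token strings
    spans:  list of (start, end) index pairs, non-overlapping and sorted
    """
    corrupted, target = [], []
    prev_end = 0
    for sentinel, (start, end) in zip(SENTINELS, spans):
        corrupted.extend(tokens[prev_end:start])  # keep uncorrupted text
        corrupted.append(sentinel)                # replace span with sentinel
        target.append(sentinel)                   # sentinel marks the span...
        target.extend(tokens[start:end])          # ...followed by its content
        prev_end = end
    corrupted.extend(tokens[prev_end:])
    return corrupted, target

tokens = "The kitten is chasing the ball .".split()
# Mask "is chasing" and "ball", mirroring the example in the text.
corrupted, target = sentinel_mask(tokens, [(2, 4), (5, 6)])
print(" ".join(corrupted))  # The kitten [X] the [Y] .
print(" ".join(target))     # [X] is chasing [Y] ball
```

Note that the loss is computed only on the target sequence, so the model never has to regenerate the uncorrupted tokens, which makes training more efficient than full reconstruction.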
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Training Encoder-Decoder Models with a Denoising Autoencoding Objective
A research team is pre-training a language model with the specific goal of making it highly proficient at understanding long-range contextual relationships and the logical flow of arguments within a paragraph. They use a method where the model learns to restore an original, clean text from a deliberately corrupted version. Which of the following corruption strategies applied to the input text would be most effective for achieving the team's specific goal?
Designing a Robust Text Correction Model
Analyzing the Impact of Input Corruption
Example of Span Masking in Denoising Autoencoding