Analysis of Denoising Training Objectives
An encoder-decoder model is trained on a denoising task where corrupted text is provided as input. One training approach requires the decoder to reconstruct the entire original, uncorrupted text. A second, 'span-based' approach requires the decoder to generate only the text that was masked in the input, preceded by special tokens indicating which mask is being filled. Explain the key difference in the target sequence length for the decoder in the span-based approach and identify a primary computational benefit of this method.
0
1
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An encoder-decoder model is being trained with a span-based denoising objective. The encoder is given the following corrupted input text: 'To learn about the solar system, we first study <mask_0> and then move on to <mask_1> planets.' The original, uncorrupted text for the masked spans is '<mask_0>' = 'the Sun' and '<mask_1>' = 'the other'. What should the target output sequence for the decoder be in this training step?
Analysis of Denoising Training Objectives
Debugging a Span-Based Denoising Training Pipeline
Your team is pretraining an internal T5-style enco...
Your company wants one internal model to support m...
Your team is pretraining an internal T5-style mode...
Your team is building a single internal T5-style t...
Diagnosing a T5-Style Model That Ignores Task Prefixes After Span-Denoising Pretraining
Choosing Between Span-Denoising Pretraining and Task-Specific Fine-Tuning in a T5-Style Text-to-Text System
Designing a Unified Text-to-Text Model and Pretraining Objective for Multiple NLP Features
Root-Cause Analysis of a T5-Style Model Producing Fluent but Unfaithful Outputs
Selecting an Architecture and Pretraining Objective for a Unified Internal NLP Service
Post-Pretraining Data Formatting Bug in a T5-Style Text-to-Text Service