Diagnosing a Denoising Pre-training Strategy
Analyze the team's pre-training approach. What is the primary limitation of using only word deletion for their intended tasks, and what is one different method of corrupting the input that they should introduce to address the observed weaknesses? Explain your reasoning.
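To make the comparison concrete, here is a minimal sketch of what word deletion and two other common corruptions do to an input. The function names, the `[MASK]` placeholder, and the corruption rate are hypothetical choices for illustration, not the team's actual setup.

```python
import random

def delete_tokens(tokens, rate, rng):
    """Token deletion: drop each token with probability `rate`.
    No placeholder is left, so the positions of the gaps are hidden."""
    return [t for t in tokens if rng.random() >= rate]

def mask_tokens(tokens, rate, rng, mask="[MASK]"):
    """Token masking: replace each chosen token with a placeholder.
    Sequence length and gap positions stay visible to the model."""
    return [mask if rng.random() < rate else t for t in tokens]

def permute_sentences(sentences, rng):
    """Sentence permutation: shuffle sentence order.
    Reconstruction requires restoring document-level structure."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled

# Assumed toy input and corruption rate, for illustration only.
rng = random.Random(0)
tokens = "the model reconstructs the original text".split()
sentences = ["First sentence.", "Second sentence.", "Third sentence."]

deleted = delete_tokens(tokens, 0.3, rng)
masked = mask_tokens(tokens, 0.3, rng)
permuted = permute_sentences(sentences, rng)

print(deleted)
print(masked)
print(permuted)
```

Each function poses a different reconstruction problem: deletion operates on individual words and hides where they were removed, masking keeps the positions visible, and permutation leaves every word intact but disturbs order across sentences.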
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Diagnosing a Denoising Pre-training Strategy
A research team is pre-training a text-based model with the goal of making it highly robust and flexible for a wide range of downstream applications, including generating coherent paragraphs and correcting grammatical errors. The model is trained to reconstruct original text from a corrupted version. Which of the following corruption strategies applied during pre-training would be most effective for achieving this goal?
Analysis of Input Corruption Techniques