Learn Before
In a two-model pre-training setup, a small 'generator' model first processes an input sentence by masking some words and then filling those masked positions with its own predictions. The resulting sentence, which may differ from the original, is then passed to a larger 'discriminator' model. What is the most critical function of the generator's output in this process?
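For illustration, here is a minimal, framework-free Python sketch of this replaced-token-detection data flow. The `toy_generator` function and its fixed `VOCAB` are hypothetical stand-ins for a small masked language model; a real setup would sample each replacement from the generator's predicted distribution over the vocabulary.

```python
import random

# Toy sketch of ELECTRA-style replaced token detection (RTD) data preparation.
# `toy_generator` is a hypothetical stand-in for a small masked language model:
# here it simply samples a token from a fixed vocabulary.

VOCAB = ["the", "chef", "cooked", "ate", "a", "meal", "quickly"]

def toy_generator(masked_tokens):
    """Fill each [MASK] position with a sampled token (MLM stand-in)."""
    return [random.choice(VOCAB) if t == "[MASK]" else t for t in masked_tokens]

def corrupt(tokens, mask_rate=0.3, seed=0):
    """Mask ~mask_rate of positions, fill them with generator samples,
    and emit per-token labels for the discriminator."""
    random.seed(seed)
    masked = [("[MASK]" if random.random() < mask_rate else t) for t in tokens]
    filled = toy_generator(masked)
    # Label: 1 = replaced (generator's sample differs from the original),
    # 0 = original. If the generator happens to reproduce the original
    # token exactly, that position is labeled 0, as in ELECTRA.
    labels = [int(f != o) for f, o in zip(filled, tokens)]
    return filled, labels

tokens = ["the", "chef", "cooked", "the", "meal"]
corrupted, labels = corrupt(tokens)
print(corrupted)  # e.g. ['the', 'chef', 'ate', 'the', 'meal']
print(labels)     # e.g. [0, 0, 1, 0, 0]  -> discriminator training targets
```

The key point the sketch makes concrete: the generator's output is the training input for the discriminator, and the per-token labels (original vs. replaced) are the discriminator's supervision signal.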
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating Corrupted Text for Model Training
A small masked language model is used to create a corrupted version of an input text sequence for a subsequent training task. Arrange the steps this model takes to generate the final corrupted sequence in the correct chronological order.
Visual Example of Generator Operation in Replaced Token Detection