A research team aims to pre-train a sequence-to-sequence model for various text generation tasks using a massive, unlabeled text corpus. Their proposed training strategy is as follows: for each document, they will randomly split it into an initial segment and a concluding segment. The model's encoder will process the entire initial segment at once to form a contextual understanding. The model will then be trained to use its decoder to generate the concluding segment, conditioned on the encoder's output. Which of the following statements provides the most accurate evaluation of this strategy for the team's objective?
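The strategy described is a prefix language modeling objective: a document is split at a random point, the encoder reads the prefix bidirectionally, and the decoder is trained to generate the continuation. A minimal sketch of preparing one such training example (the function name and toy whitespace tokenization are illustrative assumptions, not from the question):

```python
import random

def make_prefix_lm_example(tokens, rng=None):
    """Split a token sequence at a random point into (prefix, continuation).

    The prefix is the encoder input (read in full, at once); the
    continuation is the decoder's generation target, conditioned on
    the encoder's output.
    """
    rng = rng or random.Random()
    # Pick a split point that leaves both segments non-empty.
    split = rng.randint(1, len(tokens) - 1)
    encoder_input = tokens[:split]
    decoder_target = tokens[split:]
    return encoder_input, decoder_target

doc = "the quick brown fox jumps over the lazy dog".split()
enc, dec = make_prefix_lm_example(doc, random.Random(0))
```

Note that, unlike causal language modeling, the tokens inside the prefix never serve as prediction targets here, so each document contributes supervision only for the continuation segment.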
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Comparison of Prefix Language Modeling and Causal Language Modeling
You are preparing a single training example for an encoder-decoder model using a self-supervised objective on a large, unlabeled text document. Arrange the following actions into the correct chronological sequence for one complete training step.
Analyzing a Flawed Pre-training Strategy