Learn Before
Evaluating a Modified Pre-training Strategy
Based on the principles of the standard dual-task training objective, what specific type of understanding would a model trained on only one of the two objectives likely lack compared to a model trained with both objectives? Explain your reasoning.
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
BERT Loss Function
Concurrent Loss Calculation for MLM and NSP
A researcher is pre-training a large language model using a dual-task objective. The model is simultaneously trained on two tasks:
- Predicting randomly obscured words within a given text.
- Determining whether two text segments presented together originally appeared consecutively.

The final training update is based on the model's combined performance on both tasks. Which of the following statements best analyzes the primary advantage of this specific dual-task approach?
Evaluating a Modified Pre-training Strategy
The original pre-training process for Bidirectional Encoder Representations from Transformers (BERT) involves a dual-task objective in which the total loss is the sum of the losses from two distinct tasks. Match each training task to its corresponding description.
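The summed dual-task objective described in these cards can be sketched in a few lines of Python. This is a minimal illustration, not BERT's actual implementation: the function name and the loss values below are assumptions made for the example. The point it shows is that the total loss is simply the sum of the masked language modeling (MLM) loss and the next sentence prediction (NSP) loss, so a single training update reflects performance on both tasks at once.

```python
def combined_pretraining_loss(mlm_loss: float, nsp_loss: float) -> float:
    """Total pre-training loss = MLM loss + NSP loss.

    Both losses contribute to the same gradient update, which is why
    dropping one task removes the understanding that task trains.
    """
    return mlm_loss + nsp_loss


# Hypothetical loss values for illustration only:
total = combined_pretraining_loss(mlm_loss=2.31, nsp_loss=0.47)
print(total)  # 2.78
```

A model trained on this sum learns both word-level prediction (from MLM) and inter-sentence coherence (from NSP); setting either term to zero leaves the other objective untouched but forfeits what that term would have taught.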