Learn Before
Optimization Advantages of Unsupervised Pre-training
A neural network that is first trained on a large, unlabeled dataset (pre-training) often outperforms an identical network trained from a random starting point when both are subsequently trained on a specific task's labeled data (fine-tuning). From an optimization perspective, explain two distinct mechanisms through which the pre-training phase contributes to a more successful outcome in the fine-tuning phase.
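As a concrete illustration of the two-phase setup the question describes, here is a minimal NumPy sketch (all names and hyperparameters are hypothetical, not taken from the course): a one-hidden-layer network is first pre-trained as an autoencoder on unlabeled data, and its encoder weights are then fine-tuned with a logistic head on a small labeled set, alongside an identical network starting from random weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def pretrain_autoencoder(X, hidden, epochs=200, lr=0.1):
    """Label-agnostic phase: learn encoder weights W by reconstructing the input."""
    d = X.shape[1]
    W = rng.normal(0.0, 0.1, (d, hidden))   # encoder
    V = rng.normal(0.0, 0.1, (hidden, d))   # decoder (discarded after pre-training)
    for _ in range(epochs):
        H = relu(X @ W)
        err = H @ V - X                      # reconstruction error
        dV = H.T @ err / len(X)
        dH = (err @ V.T) * (H > 0)
        W -= lr * (X.T @ dH / len(X))
        V -= lr * dV
    return W

def finetune(X, y, W, epochs=200, lr=0.5):
    """Supervised phase: train a logistic head (and W) on the small labeled set."""
    W = W.copy()
    w_out = rng.normal(0.0, 0.1, W.shape[1])
    losses = []
    for _ in range(epochs):
        H = relu(X @ W)
        z = np.clip(H @ w_out, -30, 30)      # clip for numerical stability
        p = 1.0 / (1.0 + np.exp(-z))
        losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
        g = (p - y) / len(X)
        dH = np.outer(g, w_out) * (H > 0)    # backprop through the head
        w_out -= lr * (H.T @ g)
        W -= lr * (X.T @ dH)
    return losses

# Toy data: two Gaussian blobs in 5-D; many unlabeled points, few labeled ones.
X_unlab = np.vstack([rng.normal(-1, 1, (200, 5)), rng.normal(1, 1, (200, 5))])
X_lab = np.vstack([rng.normal(-1, 1, (10, 5)), rng.normal(1, 1, (10, 5))])
y_lab = np.array([0] * 10 + [1] * 10)

W_pre = pretrain_autoencoder(X_unlab, hidden=8)   # initial unlabeled training phase
W_rand = rng.normal(0.0, 0.1, (5, 8))             # random starting point

loss_pre = finetune(X_lab, y_lab, W_pre)          # starts from pre-trained weights
loss_rand = finetune(X_lab, y_lab, W_rand)
print(f"final loss  pretrained init: {loss_pre[-1]:.4f}  random init: {loss_rand[-1]:.4f}")
```

The sketch makes the question's premise inspectable: the only difference between the two fine-tuning runs is the starting point in weight space, so any gap in the loss curves must come from where pre-training placed the parameters before the supervised phase began.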
Tags
Data Science
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team trains two neural networks with identical architectures on a small, labeled dataset for a specific task.
- Network X is initialized with random weights and trained directly on the labeled data. It achieves high accuracy on the training data but performs poorly on new, unseen data.
- Network Y is first trained on a massive, unlabeled dataset using a label-agnostic objective (e.g., predicting a missing word in a sentence). Then, it is trained on the same small, labeled dataset. It achieves high accuracy and generalizes well to new data.
Which statement best analyzes the underlying reasons for Network Y's superior performance?
Evaluating a Training Strategy
Optimization Advantages of Unsupervised Pre-training