Learn Before
Evaluating a Training Strategy
Based on the scenario below, evaluate the engineer's proposal to first train the model on a large, publicly available, unlabeled text corpus before fine-tuning it on their specific, smaller labeled dataset. Justify your evaluation by explaining the two primary ways this initial training phase could address the observed problems.
Tags
Data Science
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team trains two identical neural networks on a small, labeled dataset for a specific task.
- Network X is initialized with random weights and trained directly on the labeled data. It achieves high accuracy on the training data but performs poorly on new, unseen data.
- Network Y is first pre-trained on a massive, unlabeled dataset using a label-agnostic objective (e.g., predicting a missing word in a sentence). It is then fine-tuned on the same small, labeled dataset, where it achieves high accuracy and generalizes well to new data.
Which statement best analyzes the underlying reasons for Network Y's superior performance?
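The Network X versus Network Y comparison can be sketched on toy data. The following is a minimal illustration under stated assumptions, not the setup from the question: the synthetic data, the sizes `D` and `K`, the helper names (`sample`, `pretrain_encoder`, `fine_tune`, `accuracy`), and the use of principal directions as a stand-in for a label-agnostic pre-training objective are all hypothetical choices made for brevity. The printed accuracies depend on the random seed and are not results from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 50, 2                               # input size and latent size (toy values)
basis = rng.normal(size=(K, D))            # structure shared by labeled and unlabeled data

def sample(n):
    """Draw n inputs near a K-dim subspace; the label depends only on the first latent."""
    z = rng.normal(size=(n, K))
    x = z @ basis + 0.5 * rng.normal(size=(n, D))
    y = (z[:, 0] > 0).astype(float)
    return x, y

def pretrain_encoder(x_unlabeled):
    """Label-agnostic stand-in for pre-training: top-K principal directions."""
    centered = x_unlabeled - x_unlabeled.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:K].T                        # shape (D, K)

def fine_tune(x, y, w_init, steps=300, lr=0.05):
    """Jointly train encoder w and logistic head (v, b) by gradient descent."""
    w, v, b = w_init.copy(), np.zeros(K), 0.0
    for _ in range(steps):
        h = x @ w                                          # encoded features
        logits = np.clip(h @ v + b, -30, 30)               # clipped for numerical stability
        g = (1.0 / (1.0 + np.exp(-logits)) - y) / len(y)   # d(loss)/d(logits)
        grad_v, grad_w = h.T @ g, x.T @ np.outer(g, v)
        v -= lr * grad_v
        w -= lr * grad_w
        b -= lr * g.sum()
    return w, v, b

def accuracy(x, y, params):
    w, v, b = params
    return float((((x @ w) @ v + b > 0) == (y > 0.5)).mean())

x_unlabeled, _ = sample(5000)              # labels discarded: pre-training never sees them
x_train, y_train = sample(20)              # small labeled set
x_test, y_test = sample(2000)              # unseen data

w_random = 0.1 * rng.normal(size=(D, K))   # Network X: random initialization
w_pre = pretrain_encoder(x_unlabeled)      # Network Y: pre-trained initialization

acc_x = accuracy(x_test, y_test, fine_tune(x_train, y_train, w_random))
acc_y = accuracy(x_test, y_test, fine_tune(x_train, y_train, w_pre))
print(f"random init: {acc_x:.3f}  pre-trained init: {acc_y:.3f}")
```

Both networks go through the same `fine_tune` routine on the same 20 labeled examples; only the starting weights differ, which isolates the effect of the pre-training phase.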
Evaluating a Training Strategy
Optimization Advantages of Unsupervised Pre-training