Diagnosing a Language Model's Training Deficiency
Based on the model's performance described in the case study, which common pre-training objective was likely omitted or given insufficient weight during its training? Explain why including this objective as an auxiliary task would have addressed the observed performance gap.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An AI research team is pre-training a large language model. They design a process where the model is simultaneously optimized on two distinct tasks: 1) predicting randomly hidden words within a sentence, and 2) determining if two sentences presented together originally appeared in sequence in the source text. What is the most likely reason for this dual-task training approach?
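The dual-task setup described in this question (masked-word prediction plus sentence-order classification, as in BERT-style pre-training) can be sketched with toy numbers. The probabilities and the tiny vocabulary below are invented purely for illustration; a real model would produce these distributions from its two output heads:

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-likelihood of the correct class."""
    return -math.log(probs[target_index])

# Toy model outputs (already softmax-normalized), purely illustrative.
# MLM head: distribution over a 4-word vocabulary for one masked position.
mlm_probs = [0.1, 0.7, 0.1, 0.1]   # correct word is index 1
# NSP-style head: [P(not-next), P(is-next)] for the sentence pair.
nsp_probs = [0.2, 0.8]             # the sentences truly appeared in sequence

mlm_loss = cross_entropy(mlm_probs, 1)
nsp_loss = cross_entropy(nsp_probs, 1)

# The model is optimized on the sum of both objectives at each step,
# so one shared set of weights must satisfy both tasks simultaneously:
# word-level context (MLM) and discourse-level coherence (sentence order).
total_loss = mlm_loss + nsp_loss
print(round(total_loss, 4))  # -> 0.5798
```

Summing the two losses is what makes this a single joint optimization rather than two separate training runs: gradients from both tasks update the same shared encoder.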
Differentiating Contributions of Pre-training Objectives