Differentiating Contributions of Pre-training Objectives
Imagine a language model is pre-trained using two simultaneous objectives: one that involves predicting randomly hidden words in a text, and another that involves determining if two sentences are consecutive. Analyze the distinct types of linguistic understanding the model is expected to gain from each of these two objectives. In your analysis, explain how combining them leads to a more comprehensive language model than using either objective alone.
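As a rough illustration of the setup described in the prompt, the sketch below shows one way training examples for the two objectives might be constructed. It is not a reference implementation: the function names, the `[MASK]` placeholder, the 15% masking rate, the 50/50 split between true and random sentence pairs, and the unweighted sum of the two losses are all assumptions made for the example.

```python
import random

MASK = "[MASK]"  # assumed placeholder token for hidden words


def make_mlm_example(tokens, mask_prob=0.15, rng=None):
    """Hide a random subset of tokens.

    Returns (masked_tokens, labels); labels hold the original token at
    masked positions and None elsewhere. mask_prob is an assumed value.
    """
    if rng is None:
        rng = random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)   # the model must recover this word from context
        else:
            masked.append(tok)
            labels.append(None)  # position is not scored
    return masked, labels


def make_nsp_example(sentences, index, rng=None):
    """Pair sentence `index` with its true successor (label 1) or with a
    randomly chosen other sentence (label 0).

    Assumes 0 <= index < len(sentences) - 1 and at least three sentences.
    """
    if rng is None:
        rng = random.Random()
    first = sentences[index]
    if rng.random() < 0.5:
        return first, sentences[index + 1], 1
    candidates = [s for i, s in enumerate(sentences) if i not in (index, index + 1)]
    return first, rng.choice(candidates), 0


def combined_loss(mlm_loss, nsp_loss):
    """Assumed combination: an unweighted sum of the two objective losses."""
    return mlm_loss + nsp_loss
```

The first helper only ever looks inside a single token sequence, while the second only decides how two sentences are paired, which mirrors the word-level versus sentence-level distinction the prompt asks you to analyze.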
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An AI research team is pre-training a large language model. They design a process where the model is simultaneously optimized on two distinct tasks: 1) predicting randomly hidden words within a sentence, and 2) determining if two sentences presented together originally appeared in sequence in the source text. What is the most likely reason for this dual-task training approach?
Diagnosing a Language Model's Training Deficiency