The development of powerful, general-purpose language models was significantly accelerated by a key combination of an architectural innovation and a learning strategy. Which statement best analyzes the distinct yet complementary roles of these two components in this breakthrough?
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Related
A research lab is developing a new large-scale language model. They have access to a state-of-the-art neural architecture designed to effectively process long sequences of text. They are debating between two training strategies:
- Strategy A: Train the model from scratch on a high-quality, human-labeled dataset of 1 million examples specifically designed for question-answering.
- Strategy B: First, train the model on a massive, unlabeled corpus of 1 trillion words from the internet with the objective of predicting the next word in a sentence. Then, optionally, adapt it to specific tasks.
Which strategy is more likely to produce a powerful, general-purpose model capable of a wide range of language understanding and generation tasks, and why?
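To make the next-word objective in Strategy B concrete, here is a minimal sketch of how raw, unlabeled text can supply its own training targets. The function name `next_word_pairs`, the `context_size` parameter, and the whitespace tokenization are illustrative assumptions, not part of the question or of any particular library; real systems use subword tokenizers and far longer contexts.

```python
def next_word_pairs(text, context_size=3):
    """Build (context, target) pairs from raw text; no human labels are needed.

    Hypothetical helper for illustration: splits on whitespace and uses up to
    `context_size` preceding words as the context for each target word.
    """
    words = text.split()
    pairs = []
    for i in range(1, len(words)):
        context = tuple(words[max(0, i - context_size):i])
        pairs.append((context, words[i]))
    return pairs


corpus = "the model predicts the next word"
pairs = next_word_pairs(corpus)
for context, target in pairs:
    print(context, "->", target)
```

Every position in the text yields one training example, which is why a corpus of a trillion words provides vastly more supervision signal than a million hand-labeled examples.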
The Engine of Modern AI: Architecture and Learning