Distinction Between LLMs and BERT in Task Generalization
Generative large language models (LLMs) differ from earlier pre-trained models such as BERT in their ability to perform new tasks without task-specific training, a capability known as zero-shot learning. BERT-style models, by contrast, are typically fine-tuned on labeled data to specialize in a single, predefined task.
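As a rough illustration of the contrast, the sketch below shows what each approach needs as input before it can be used. It calls no real model; the `LabeledExample` type, the `build_zero_shot_prompt` helper, and the sample data are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class LabeledExample:
    """One supervised example for task-specific fine-tuning (hypothetical type)."""
    text: str
    label: str


def build_zero_shot_prompt(instruction: str, text: str) -> str:
    """Describe the task in plain language; no labeled examples are included."""
    return f"{instruction}\n\n{text}\n"


# Zero-shot use of a generative LLM: the task is specified only in the prompt.
prompt = build_zero_shot_prompt(
    "Convert the following recipe from imperial to metric units.",
    "2 cups flour, 8 oz butter",
)

# Fine-tuning a BERT-style model: the task is specified by labeled data,
# and the model's weights are updated on that data before it is used.
train_set = [
    LabeledExample("The film was wonderful.", "positive"),
    LabeledExample("A tedious, overlong mess.", "negative"),
]

print(prompt)
print(len(train_set), "labeled examples prepared for fine-tuning")
```

In the zero-shot case, switching to a different task means changing only the instruction string. In the fine-tuning case, a new task generally calls for a new labeled dataset and another round of training.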
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A large language model undergoes two stages of training. First, it is pre-trained on a vast dataset of internet text. Second, it is fine-tuned on a highly diverse dataset containing thousands of varied instructions and their corresponding correct outputs (e.g., 'Summarize this text...', 'Translate this sentence...', 'Write a poem about...'). The model is then given a completely novel task it has never seen before: 'Convert the following recipe from imperial to metric units.' Which statement best analyzes the likely outcome?
AI Assistant Development Strategy
A large language model develops the ability to perform tasks it has never been explicitly trained on. Arrange the following stages in the correct chronological and causal order that leads to this outcome.
Learn After
A development team has access to two pre-trained language models. Model X is a large, generative model known for its ability to follow general instructions. Model Y is an encoder-based model that achieves state-of-the-art performance on a task after it has been fine-tuned with thousands of task-specific examples. The team's immediate goal is to create a prototype that can summarize legal documents, a task for which they have no training data. Which of the following statements most accurately analyzes the situation?
Model Selection for a Resource-Constrained Startup
Model Suitability for Novel Tasks