Impact of Pre-training Data on Instruction Following
A research lab pre-trains two large language models, Model X and Model Y, on different datasets of the same size. After pre-training is complete, and with no further training of any kind, both models are given the same novel prompt: 'Translate the following English sentence into French: The cat is sleeping.' Analyze this case study and determine which model is more likely to succeed, explaining your reasoning.
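To make the comparison concrete, here is a minimal sketch of how such a zero-shot probe could be run with the Hugging Face transformers library. The checkpoint names model_x and model_y are hypothetical stand-ins for the lab's two models; any public causal LM checkpoint (e.g. gpt2) could be substituted to run the script end to end.

```python
# Minimal zero-shot probe: feed the same instruction prompt to two base
# (pre-training only) checkpoints and compare their continuations.
# "model_x" / "model_y" are placeholder names for the hypothetical
# checkpoints in the case study; substitute a real model id such as "gpt2".
from transformers import pipeline

PROMPT = (
    "Translate the following English sentence into French: "
    "The cat is sleeping.\n"
)

for name in ("model_x", "model_y"):  # hypothetical checkpoint names
    generator = pipeline("text-generation", model=name)
    output = generator(
        PROMPT,
        max_new_tokens=30,  # a short continuation is enough to judge the attempt
        do_sample=False,    # greedy decoding keeps the comparison deterministic
    )
    # The pipeline returns the prompt plus the continuation; strip the prompt
    # so only each model's own output is printed.
    print(f"{name}: {output[0]['generated_text'][len(PROMPT):]!r}")
```

Under the hypothesis named in the title, the model whose pre-training data contained instruction-like patterns (translation pairs, Q&A, task demonstrations) would be the one more likely to produce something like 'Le chat dort.', while the other would tend to continue the prompt as generic text.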
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A researcher pre-trains a new language model on a vast and diverse dataset of text from the internet. Without any subsequent specialized training, the researcher tests the model with the prompt: 'Summarize the following paragraph in one sentence: [paragraph text]'. The model successfully produces a coherent, one-sentence summary. Which of the following statements provides the most accurate explanation for this capability?
Zero-Shot Generalization from Pre-trained Instruction Knowledge
A large language model's capacity to understand and execute a wide range of tasks based on textual prompts is primarily instilled through a specialized training stage that is separate from and follows its initial, general-purpose language learning phase.
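The statement above can also be examined empirically by contrasting a base (pre-training only) checkpoint with an instruction-tuned sibling on the summarization prompt from the first related item. Below is a minimal sketch of that comparison; the specific base/instruct checkpoint pair and the example paragraph are illustrative assumptions, not part of the original case study.

```python
# Probe whether a base (pre-training only) model already follows the
# instruction, or whether the instruction-tuned sibling is needed.
# The checkpoint pair below is an assumed example of a base model and an
# instruction-tuned derivative; any comparable pair would serve.
from transformers import pipeline

PROMPT = (
    "Summarize the following paragraph in one sentence: "
    # Example paragraph invented for illustration:
    "The cat slept all afternoon on the warm windowsill while rain "
    "fell steadily outside and the house stayed quiet.\n"
)

CHECKPOINTS = {
    "base (pre-training only)": "EleutherAI/pythia-2.8b",  # assumed example
    "instruction-tuned": "databricks/dolly-v2-3b",         # assumed example
}

for label, name in CHECKPOINTS.items():
    generator = pipeline("text-generation", model=name)
    output = generator(PROMPT, max_new_tokens=40, do_sample=False)
    # Strip the prompt so only the continuation is shown for each model.
    print(f"{label}: {output[0]['generated_text'][len(PROMPT):]!r}")
```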