Evaluating Training Data Strategies for Model Performance
An AI development team is fine-tuning a large language model to be a helpful assistant. They are considering two different strategies for creating the training dataset:
- Strategy A: Generate a very large dataset of 1 million examples, but focus on a single, high-frequency task: summarizing news articles.
- Strategy B: Generate a smaller but highly diverse dataset of 50,000 examples that cover hundreds of different tasks, such as creative writing, coding, question-answering, and planning.
Evaluate these two strategies in terms of their likely impact on the model's ability to correctly follow new, unseen instructions after training. Which strategy is superior for achieving this goal, and why? Justify your answer by explaining the trade-offs between dataset size, diversity, and the desired outcome.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Language Model Performance Analysis
An AI development team fine-tunes two language models. Model A is trained on 100,000 examples of a single, narrow task: rephrasing sentences into five specific styles. Model B is trained on 10,000 examples covering a wide variety of tasks (e.g., summarization, translation, creative writing). When both models are tested on a completely new, unseen instruction like 'generate a grocery list for a three-course Italian meal,' which outcome is most likely?