1Cademy - Evaluating a Data Augmentation Strategy

Learn Before

Bootstrapping LLMs with Self-Instruct from a Seed Dataset

Short Answer

Evaluating a Data Augmentation Strategy

A development team has a limited set of 200 expert-written examples for training a model to summarize legal documents. To create more training data, they use these examples as few-shot prompts for a powerful, general-purpose language model, asking it to generate thousands of new legal document summaries. Explain the primary advantage and the most significant risk of this data generation method compared to solely using the original 200 expert-written examples.
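For concreteness, the generation loop described in the scenario might look like the minimal sketch below. It makes several assumptions the scenario does not state: the seed set is held as (document, summary) pairs, a pool of unsummarized documents is available, and the model is reached through a hypothetical `generate` callable that maps a prompt string to a completion. The function names and the choice of `k=3` shots per prompt are illustrative only.

```python
import random


def build_few_shot_prompt(seed_examples, new_document, k=3, rng=None):
    """Assemble a few-shot prompt from k randomly sampled seed examples.

    seed_examples: list of (document, summary) pairs (assumed layout).
    """
    rng = rng or random.Random()
    shots = rng.sample(seed_examples, k)  # k distinct expert-written pairs
    parts = [f"Document:\n{doc}\nSummary:\n{summary}\n" for doc, summary in shots]
    # The final block leaves the summary open for the model to complete.
    parts.append(f"Document:\n{new_document}\nSummary:\n")
    return "\n".join(parts)


def generate_synthetic_examples(seed_examples, unlabeled_documents, generate, k=3, seed=0):
    """Create one synthetic (document, summary) pair per unlabeled document.

    generate: hypothetical callable (prompt -> text) standing in for
    whatever language-model client the team actually uses.
    """
    rng = random.Random(seed)
    synthetic = []
    for doc in unlabeled_documents:
        prompt = build_few_shot_prompt(seed_examples, doc, k=k, rng=rng)
        synthetic.append((doc, generate(prompt).strip()))
    return synthetic
```

Resampling the shots for every prompt is one possible design choice; it varies which of the 200 expert examples the model sees on each call instead of reusing a fixed handful.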

Updated 2025-10-10
