A development team has a small, high-quality dataset for training a sentiment analysis model. To improve the model's performance without collecting more user data, they use a powerful, general-purpose language model to paraphrase each existing example, generating five new variations for every original sentence while preserving the sentiment label. This process of creating synthetic training examples is most directly analogous to which traditional machine learning practice, and why?
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Analysis of Dataset Expansion Strategies
A development team has a small, high-quality dataset for training a sentiment analysis model. To improve the model's performance without collecting more user data, they use a powerful, general-purpose language model to paraphrase each existing example, generating five new variations for every original sentence while preserving the sentiment label. This process of creating synthetic training examples is most directly analogous to which traditional machine learning practice, and why?
Evaluating a Synthetic Data Generation Strategy