Small Model-Based Data Selection
This data selection technique uses a smaller, auxiliary model to filter or curate the dataset that will be used to train a larger model. The small model scores candidate samples and selects a refined subset of high-quality examples. The larger model is then trained on this curated dataset, with loss computed and parameters updated only on the selected data.
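The workflow can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `small_model_score` function below is a toy stand-in for the auxiliary model (in practice it would be, for example, the small model's perplexity on each sample or a learned quality classifier), and the function names are hypothetical.

```python
def small_model_score(sample: str) -> float:
    """Stand-in for the auxiliary model's quality score (assumption:
    higher = better). A real implementation would query the small
    model, e.g. using its perplexity or a learned quality score."""
    # Toy heuristic: penalize very short or highly repetitive text.
    tokens = sample.split()
    if not tokens:
        return 0.0
    uniqueness = len(set(tokens)) / len(tokens)
    length_bonus = min(len(tokens) / 20.0, 1.0)
    return uniqueness * length_bonus


def select_data(dataset: list[str], keep_fraction: float = 0.5) -> list[str]:
    """Score every sample with the small model and keep the top fraction.
    The returned subset is the curated dataset fed to the large model."""
    scored = sorted(dataset, key=small_model_score, reverse=True)
    k = max(1, int(len(scored) * keep_fraction))
    return scored[:k]


if __name__ == "__main__":
    raw = [
        "the cat sat on the mat near the open window in the sun",
        "yes yes yes yes yes yes",
        "ok",
        "a clear well-formed instruction response with varied wording",
    ]
    curated = select_data(raw, keep_fraction=0.5)
    print(curated)  # the two higher-quality samples survive the filter
```

The key design point is that scoring every sample with the cheap small model is far less expensive than exposing the large model to the full noisy dataset, so the large model's training budget is spent only on the highest-value data.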
