Learn Before
A research team is fine-tuning a very large, computationally expensive language model on a massive, noisy dataset. To optimize their limited budget, they first perform a single pass with the large model over the dataset to calculate the training loss for each data sample. They then train a much smaller, faster model to predict the loss values that the large model assigned. Finally, they use this trained small model to filter the dataset, keeping only the samples predicted to have high loss. Which statement best evaluates the effectiveness of this data selection strategy?
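The three-step workflow in the question can be sketched in a few lines. This is a minimal illustration with hypothetical stand-ins: random features replace real data samples, a synthetic target replaces the large model's measured losses, and ordinary least squares plays the role of the "much smaller, faster model".

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: per-sample features and the large model's
# per-sample losses (in practice, one expensive pass over the dataset).
features = rng.normal(size=(1000, 16))
true_weights = rng.normal(size=16)
large_model_loss = features @ true_weights + rng.normal(scale=0.1, size=1000)

# Step 2: train a cheap proxy (here, least squares) to predict
# the loss values the large model assigned.
w, *_ = np.linalg.lstsq(features, large_model_loss, rcond=None)
predicted_loss = features @ w

# Step 3: filter the dataset, keeping only samples the proxy
# predicts to have high loss -- e.g. the top 10%.
k = int(0.1 * len(features))
selected = np.argsort(predicted_loss)[-k:]
print(len(selected))
```

Only the small model is queried at selection time, so the large model's cost is a single scoring pass rather than repeated evaluation of every candidate sample.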
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Ensemble of Small Models for Data Selection
Visual Diagram of Data Selection with a Small Model
You are tasked with curating a high-quality dataset for fine-tuning a large, computationally expensive model from a massive, unfiltered data source. You decide to use a smaller, auxiliary model to help with the selection process. Arrange the following steps into the correct logical sequence for this data selection workflow.
Optimizing Training with a Limited Budget