Learn Before
Optimizing Training with a Limited Budget
A machine learning startup has developed a very large, powerful language model. They have a massive, unfiltered dataset scraped from the web to use for the final stage of training. However, their computational budget is extremely limited, and they cannot afford to train the large model on the entire dataset. Their primary goal is to achieve the maximum possible performance gain for their large model within this strict budget. Describe a data selection strategy they could implement using a smaller, computationally cheaper model to address this challenge. Explain the core principle that makes this strategy effective in their situation.
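One common instance of this strategy is to run the small, cheap model over the full dataset as a scorer (for example, using its perplexity or a learned quality score), then train the large model only on the top-scoring subset that fits the budget. The sketch below illustrates the selection step; `score_fn` and `keep_fraction` are illustrative names, not part of any specific library, and the scoring function itself is assumed to come from the small model.

```python
def select_for_budget(samples, score_fn, keep_fraction):
    """Score every sample with a cheap model and keep the top fraction.

    score_fn is assumed to return a usefulness score (higher = more
    valuable); in practice it might be the small model's negative
    perplexity or a learned quality classifier (hypothetical here).
    """
    ranked = sorted(samples, key=score_fn, reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]

# Toy usage: score by length as a stand-in for a real quality score.
samples = ["a", "aaaa", "aa", "aaa"]
kept = select_for_budget(samples, len, 0.5)
```

The core principle is that scoring is much cheaper than training: the small model's full pass over the data costs far less than training the large model on it, so spending a little compute on selection concentrates the expensive large-model updates on the most valuable samples.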
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Ensemble of Small Models for Data Selection
A research team is fine-tuning a very large, computationally expensive language model on a massive, noisy dataset. To optimize their limited budget, they first perform a single pass with the large model over the dataset to calculate the training loss for each data sample. They then train a much smaller, faster model to predict the loss values that the large model assigned. Finally, they use this trained small model to filter the dataset, keeping only the samples predicted to have high loss. Which statement best evaluates the effectiveness of this data selection strategy?
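The pipeline described in this question can be sketched as follows. All names here (`large_model_loss`, `train_proxy`) are illustrative stand-ins, and `train_proxy` is assumed to return a callable that predicts a loss for a sample; this is a sketch of the described workflow, not a reference implementation.

```python
def select_with_proxy(dataset, large_model_loss, train_proxy, keep_fraction):
    """Sketch of the three-stage pipeline from the question.

    1. One expensive pass: compute the large model's loss per sample.
    2. Train a cheap proxy model to predict those losses.
    3. Keep the samples whose *predicted* loss is highest.
    """
    losses = [large_model_loss(s) for s in dataset]        # step 1: costly pass
    proxy = train_proxy(list(zip(dataset, losses)))        # step 2: fit proxy
    ranked = sorted(dataset, key=proxy, reverse=True)      # step 3: filter
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]
```

Note that step 1 already requires running the large model over the entire dataset, which is worth weighing against the budget constraint when evaluating the strategy.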
Visual Diagram of Data Selection with a Small Model
You are tasked with curating a high-quality dataset for fine-tuning a large, computationally expensive model from a massive, unfiltered data source. You decide to use a smaller, auxiliary model to help with the selection process. Arrange the following steps into the correct logical sequence for this data selection workflow.
Optimizing Training with a Limited Budget