Learn Before
A team is preparing a large, diverse text dataset to train a powerful new language model. To improve the final model's quality, they first use a smaller, pre-existing language model to score each document in the dataset. Documents that receive a very low score from this smaller model are removed. Which of the following documents is most likely to be removed from the dataset during this filtering process?
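The filtering process described above is commonly implemented by scoring each document with the small model's average log-likelihood (equivalently, perplexity) and dropping the lowest-scoring documents. The sketch below is illustrative only: it stands in for the "smaller, pre-existing language model" with a toy unigram model (`train_unigram`, `score`, and `filter_docs` are hypothetical names, not from any particular library), but the pipeline shape — train/obtain a scorer, score every document, drop those below a threshold — is the same one the question describes.

```python
import math
from collections import Counter

def train_unigram(corpus):
    """Stand-in for the 'smaller model': a unigram word model
    with add-one smoothing fit on a reference corpus."""
    counts = Counter(w for doc in corpus for w in doc.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves mass for unseen words
    return lambda w: (counts[w] + 1) / (total + vocab)

def score(doc, prob):
    """Average log-likelihood per token; higher means the doc
    looks more 'natural' to the scoring model."""
    words = doc.split()
    return sum(math.log(prob(w)) for w in words) / max(len(words), 1)

def filter_docs(docs, prob, threshold):
    """Keep only documents scoring at or above the threshold."""
    return [d for d in docs if score(d, prob) >= threshold]
```

Under this scheme, gibberish or heavily corrupted documents receive very low likelihood from the scoring model and are the ones most likely to be removed, while fluent text is retained.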
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Likelihood and Cross-Entropy as Data Filtering Criteria
Weak-to-Strong Generalization via Fine-Tuning on Weak Model Data
Optimizing Training Data for a Medical Language Model
You are tasked with curating a high-quality dataset for training a large language model. You decide to use a smaller, less powerful model to help filter an initial, large collection of text documents. Arrange the following steps of this data filtering process in the correct logical order.