Learn Before
Historical Applications of Self-Training
Self-training has a long record of successful use across Natural Language Processing. Two prominent early applications are word sense disambiguation and document classification.
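The self-training loop behind a document classification setup of this kind can be sketched as follows. This is a minimal illustration, not a reconstruction of any historical system: the Naive Bayes classifier, the function names (`train`, `predict`, `self_train`), the 0.8 confidence threshold, and the toy documents are all assumptions chosen to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token counts
    class_counts = Counter()            # label -> number of documents
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return (best_label, posterior_probability) for one document."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # tokens never seen in training are ignored
                # Add-one smoothing over the known vocabulary.
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        log_scores[label] = score
    # Turn log scores into normalized posteriors.
    top = max(log_scores.values())
    probs = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    norm = sum(probs.values())
    best = max(probs, key=probs.get)
    return best, probs[best] / norm


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Grow the labeled set with confident pseudo labels, retraining each round."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for text in pool:
            label, prob = predict(model, text)
            if prob >= threshold:
                confident.append((text, label))  # accept the pseudo label
            else:
                remaining.append(text)           # keep for a later round
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
        model = train(labeled)  # retrain on gold + pseudo-labeled data
    return model, labeled


# Toy data (hypothetical): two gold labels, three unlabeled documents.
gold = [
    ("the team won the match", "Sports"),
    ("parliament passed the budget vote", "Politics"),
]
raw = [
    "the striker scored in the match",
    "the senate vote on the budget",
    "the coach praised the striker",
]

if __name__ == "__main__":
    model, augmented = self_train(gold, raw)
    for text, label in augmented:
        print(label, "|", text)
```

The confidence threshold is the main design choice here: only predictions the current model is fairly sure about become pseudo labels. A wrong pseudo label that passes the threshold is still treated as ground truth in later rounds, so early mistakes can reinforce themselves.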
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Related
Historical Applications of Self-Training
Comparison of Self-Supervised Pre-training and Self-Training
A machine learning team is implementing a self-training procedure to improve a text classification model. They begin by training an initial model on a small, high-quality labeled dataset. They then use this model to predict labels for a vast collection of unlabeled text, creating 'pseudo labels'. Finally, they retrain the model on a combination of the original labeled data and the newly pseudo-labeled data. Which of the following describes the most critical risk inherent to this self-training approach?
A machine learning team has a small set of high-quality labeled data and a very large set of unlabeled data. They decide to use an iterative approach to improve their model's performance. Arrange the core steps of this process in the correct chronological order.
Evaluating a Model Training Strategy
Learn After
A research team in the late 1990s is tasked with building a system to automatically categorize a massive, newly digitized library of one million news articles into topics like 'Sports', 'Politics', and 'Business'. The team has a very limited budget, allowing them to hire an expert to manually label only 500 articles. Given the constraints and the nature of the task, which of the following approaches represents the most historically successful and pragmatic strategy for them to pursue?
Match each early Natural Language Processing task with the description that best illustrates how a model could be improved using its own predictions on unlabeled data.
Comparative Analysis of an Iterative Labeling Technique