Learn Before
Comparative Analysis of an Iterative Labeling Technique
Consider a machine learning approach in which a model is trained on a small set of labeled examples and then used to automatically label a much larger set of unlabeled data. The most confident of these machine-generated labels are added to the training set, and the model is retrained. Analyze the potential benefits and challenges of applying this iterative process to two distinct early natural language processing tasks: (1) using the surrounding text to assign the correct meaning to a word that has multiple definitions, and (2) sorting entire articles into predefined categories such as 'sports' or 'technology'.
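The iterative process described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: it uses a toy multinomial Naive Bayes classifier with add-one smoothing as the model (the prompt does not name one), and the function names train, predict, and self_train, the confidence threshold of 0.8, and the example sentences are all hypothetical. It is applied here to the second task, article categorization.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Fit a tiny multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token counts
    class_counts = Counter()            # label -> number of examples
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return (best_label, posterior probability of that label)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in text.lower().split():
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    # Normalize log scores into probabilities in a numerically stable way.
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    best = max(exp, key=exp.get)
    return best, exp[best] / z

def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Repeatedly add confident machine-generated labels and retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(model, text)
            if conf >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:  # nothing new to learn from, so stop early
            break
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled, pool

# Hypothetical seed labels and unlabeled articles.
seed = [("the team won the match", "sports"),
        ("new phone chip released", "technology")]
unlabeled = ["the team lost the match",
             "chip maker released new phone",
             "weather today"]
model, labeled, pool = self_train(seed, unlabeled)
```

In this toy run, the two articles that share vocabulary with the seed examples are absorbed into the training set, while "weather today" stays unlabeled because the model has no evidence about it. The threshold is the main design choice: a lower value grows the training set faster but lets wrong labels in, and those errors are then reinforced at each retraining round.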
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team in the late 1990s is tasked with building a system to automatically categorize a massive, newly digitized library of one million news articles into topics like 'Sports', 'Politics', and 'Business'. The team has a very limited budget, allowing them to hire an expert to manually label only 500 articles. Given the constraints and the nature of the task, which of the following approaches represents the most historically successful and pragmatic strategy for them to pursue?
Match each early Natural Language Processing task with the description that best illustrates how a model could be improved using its own predictions on unlabeled data.