Learn Before
Multiple Choice

A machine learning team is implementing a self-training procedure to improve a text classification model. They begin by training an initial model on a small, high-quality labeled dataset. They then use this model to predict labels for a vast collection of unlabeled text, creating 'pseudo labels'. Finally, they retrain the model on a combination of the original labeled data and the newly pseudo-labeled data. Which of the following describes the most critical risk inherent to this self-training approach?

0

1

Updated 2025-09-26

Contributors are:

Who are from:

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science