Optimizing a Text Classification Pipeline
A team is building a text classifier for customer support tickets using a large, pre-trained language model. The model generates a vector representation for each ticket, which is then passed to a simple linear classifier for the final prediction. The system's accuracy is lower than desired.
Engineer A proposes replacing the simple linear classifier with a more complex, multi-layer neural network. Engineer B proposes focusing on fine-tuning the large, pre-trained language model itself on the ticket data.
Evaluate these two proposals. Which approach is generally considered the more impactful first step for improving performance in such a system, and why? Your explanation should address the distinct roles of the main model and the final prediction layer.
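To make the two proposals concrete, the sketch below splits the pipeline into its two components. Everything in it is illustrative rather than taken from the scenario: `toy_encoder` is a hypothetical stand-in for the pre-trained language model (a real one is learned and far larger), `scales` stands in for that model's parameters, and all weights are arbitrary hand-picked numbers. It is only meant to show where each proposal intervenes: Engineer A changes the function applied to the vector, while Engineer B changes how the vector itself is produced.

```python
import math

def toy_encoder(text, scales):
    """Hypothetical stand-in for the pre-trained model: text -> fixed-size vector.

    `scales` plays the role of the model's learned parameters.
    """
    dim = len(scales)
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch) / 1000.0
    return [s * v for s, v in zip(scales, vec)]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def affine(vec, weights, bias):
    """One affine map; `weights` has one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def linear_head(vec, W, b):
    """Current design: a single linear layer on top of the representation."""
    return softmax(affine(vec, W, b))

def mlp_head(vec, W1, b1, W2, b2):
    """Proposal A: a hidden ReLU layer, then an output layer. Same input vector."""
    hidden = [max(0.0, h) for h in affine(vec, W1, b1)]
    return softmax(affine(hidden, W2, b2))

ticket = "My invoice is wrong"

# Baseline: frozen encoder + linear head over 3 made-up classes.
vec = toy_encoder(ticket, [1.0, 1.0, 1.0, 1.0])
W = [[0.1, -0.2, 0.3, 0.0],
     [0.0, 0.1, -0.1, 0.2],
     [0.2, 0.0, 0.1, -0.3]]
b = [0.0, 0.0, 0.0]
baseline = linear_head(vec, W, b)

# Proposal A: same vector, more expressive head.
W1 = [[0.5, 0.0, -0.5, 0.1], [0.0, 0.3, 0.2, -0.1]]
b1 = [0.0, 0.0]
W2 = [[1.0, -1.0], [0.0, 0.5], [-0.5, 0.2]]
b2 = [0.0, 0.0, 0.0]
proposal_a = mlp_head(vec, W1, b1, W2, b2)

# Proposal B: same head, but the encoder's parameters change, so the
# representation changes. The new values are set by hand here; real
# fine-tuning would learn them from the labelled tickets.
tuned_vec = toy_encoder(ticket, [1.5, 0.5, 1.0, 2.0])
proposal_b = linear_head(tuned_vec, W, b)
```

In this toy setup the head can only recombine whatever information `vec` already carries, whereas changing the encoder alters the information available to any head placed on top of it.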
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Related
Optimizing a Text Classification Pipeline
A team is developing a text classification system to sort user feedback into 30 categories. They use a large pre-trained model to convert each piece of feedback into a single, information-rich numerical vector. For the final step of mapping this vector to one of the 30 categories, they are considering different options. Given that the team is working with a limited computational budget and a short project timeline, which of the following choices for the final classification layer is most justifiable?
In a text classification system that uses a large pre-trained model to generate a single vector representation for an input text, the final component that maps this vector to a class label must be a multi-layer neural network to maintain high performance.
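For the 30-category scenario with a limited budget, a rough way to compare candidate heads is to count their trainable parameters. The sketch below does that arithmetic under stated assumptions: the embedding size of 768 and the hidden width of 512 are hypothetical values chosen for illustration, since the scenario specifies only the 30 categories.

```python
def linear_head_params(embed_dim, num_classes):
    """Weights plus biases for a single linear layer."""
    return embed_dim * num_classes + num_classes

def mlp_head_params(embed_dim, hidden_dim, num_classes):
    """Weights plus biases for one hidden layer followed by an output layer."""
    hidden_layer = embed_dim * hidden_dim + hidden_dim
    output_layer = hidden_dim * num_classes + num_classes
    return hidden_layer + output_layer

EMBED_DIM = 768    # assumption: size of the vector produced by the pre-trained model
HIDDEN_DIM = 512   # assumption: width of the hypothetical hidden layer
NUM_CLASSES = 30   # from the scenario

linear_count = linear_head_params(EMBED_DIM, NUM_CLASSES)
mlp_count = mlp_head_params(EMBED_DIM, HIDDEN_DIM, NUM_CLASSES)

print(f"linear head: {linear_count} parameters")  # 23070
print(f"MLP head:    {mlp_count} parameters")     # 409118
```

These counts cover only the final layer, not the pre-trained model, and parameter count is a rough proxy for training cost rather than a measure of accuracy.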