A development team starts with a large, pre-trained language model that has a general understanding of text. Their goal is to create a system that can classify customer feedback emails into one of three categories: 'Urgent', 'Standard', or 'Spam'. To adapt the general-purpose model for this specific classification task, what is the most appropriate and standard architectural change they should implement?
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A development team starts with a large, pre-trained language model that has a general understanding of text. Their goal is to create a system that can classify customer feedback emails into one of three categories: 'Urgent', 'Standard', or 'Spam'. To adapt the general-purpose model for this specific classification task, what is the most appropriate and standard architectural change they should implement?
Adapting a Pre-trained Model for Different NLP Tasks
A research team has access to a powerful, pre-trained language model that produces a contextualized numerical representation for every token in an input sequence. To solve specific problems, they must add a new, task-specific prediction network (a 'head') on top of this pre-trained base. Match each downstream task with the architectural design of the prediction head best suited to accomplish it.