Learn Before
Process of Adapting a Supervised Pre-trained Model
Adapting a supervised pre-trained model for a new downstream task involves constructing a new classification system. This is achieved by integrating the pre-trained model, such as a sequence model, with a new classification layer tailored to the specific task (e.g., subjectivity detection). The parameters of this new system are then fine-tuned using task-specific labeled data to optimize its performance. Once adjusted, the model is employed to classify new sequences for the downstream task.
0
1
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Process of Adapting a Supervised Pre-trained Model
Advantages of Supervised Pre-training
Disadvantages of Supervised Pre-training
Example of a Supervised Pre-training Task
A startup is building a system to automatically categorize legal contracts into specific sub-types (e.g., 'lease agreement', 'employment contract', 'non-disclosure agreement'). They have a very small, private dataset of 500 labeled contracts. Their proposed strategy is to first train a large neural network on a massive, publicly available dataset of millions of labeled news articles, classifying them by topic (e.g., 'sports', 'politics', 'technology'). After this initial training, they plan to adapt the model to their legal contract categorization task. What is the most significant weakness of this proposed pre-training approach for their specific goal?
A machine learning engineer wants to use a supervised pre-training approach to build a model that can detect toxic comments online. Arrange the following steps in the correct chronological order to reflect this process.
Evaluating a Pre-training Strategy for Scientific Text
Assumption of Supervised Pre-training
Learn After
A research team has a language model that was pre-trained on a large dataset of movie reviews to classify them as having 'positive' or 'negative' sentiment. They now want to use this model for a new project: classifying short medical summaries into one of three categories: 'Cardiology', 'Neurology', or 'Oncology'. Which of the following describes the most effective procedure for adapting the pre-trained model to this new task?
Troubleshooting a Model Adaptation Process
You are tasked with adapting a language model, which was initially trained to identify the sentiment of customer reviews, for a new purpose: classifying news articles into categories like 'Sports', 'Technology', and 'Politics'. Arrange the following steps in the correct chronological order to successfully adapt the model for this new classification task.