Learn Before
Diagnosing a Soft Prompt Training Issue
A machine learning engineer is tasked with adapting a large, pre-trained language model for a specialized legal document summarization task. They prepend a set of learnable, continuous prompt vectors to the input and train the system on a large dataset of legal documents and their corresponding summaries. After training, they observe two outcomes: the model performs very well on the legal summarization task, but its performance on general conversational tasks degrades significantly. Based on the standard method for training these types of prompts, what is the most probable cause of this negative side effect?
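The scenario above hinges on which parameters receive gradient updates. In standard soft-prompt (prompt-tuning) training, the pre-trained model is frozen and only the prompt vectors are learned, so the base model's general abilities should be preserved. A minimal NumPy sketch of this setup, using a toy frozen linear map as a stand-in for the language model (all names and dimensions here are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "pre-trained model": a fixed linear map standing in for the LLM.
W = rng.normal(size=(2, 4))            # frozen model weights (toy)
prompt = np.zeros(2)                   # learnable soft-prompt parameters
x = rng.normal(size=2)                 # one training input
y = np.array([1.0, -1.0])              # its target output
lr = 0.1                               # learning rate (illustrative)

def loss(p):
    """0.5 * squared error of the model's output vs. the target."""
    return 0.5 * np.sum((W @ np.concatenate([p, x]) - y) ** 2)

W_before = W.copy()
loss_before = loss(prompt)

for _ in range(100):
    h = np.concatenate([prompt, x])    # prepend the soft prompt to the input
    err = W @ h - y                    # dL/d(output) for the 0.5*MSE loss
    # The gradient flows through the frozen model back to the prompt,
    # but ONLY the prompt is updated; W is deliberately left untouched.
    grad_prompt = W[:, :2].T @ err
    prompt -= lr * grad_prompt

loss_after = loss(prompt)
assert np.array_equal(W, W_before)     # base model unchanged (frozen)
assert loss_after < loss_before        # prompt alone adapted to the task
```

If, instead, `W` were also updated during training, the task loss would still fall, but the base model's behavior on unrelated inputs would drift, which is the failure mode the question asks about.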
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Visual Diagram of Soft Prompt Training
A developer is adapting a large, pre-trained language model for a new task by adding a small set of learnable, continuous vector parameters to the input. During the training process, for each example, a loss is computed by comparing the model's output to the correct output. According to the standard supervised learning approach for this technique, how is this loss used to update the system's parameters?
A machine learning engineer is using a supervised learning approach to train a set of continuous, learnable prompt parameters for a large, pre-trained language model. The goal is to adapt the model for a specific task. During each training step, a loss is calculated based on the difference between the model's prediction and the correct output. Which of the following statements most accurately describes how the system's parameters are handled during this process?