Classifying an Alignment Approach
A development team has a large, general-purpose language model. To make it safer for public use, they implement a system where every user prompt is automatically prepended with a detailed set of instructions, such as 'You are a helpful and harmless assistant. Do not generate unsafe content.' This is done for every new conversation without making any permanent changes to the model's underlying structure. Based on this approach, which category of alignment method is the team using? Justify your answer.
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Classifying an Alignment Approach
Match each language model development technique with the description that best classifies its stage, application time, and effect on the model's parameters.
A development team is working with a large, pre-existing language model. They need to make the model generate outputs in a specific structured format for a new, specialized task. However, they are operating under a strict constraint: they cannot alter the model's saved parameters due to computational limitations. Which of the following strategies should they employ?