Learn Before
A company deploys a large, pre-trained language model as its public-facing chatbot. Because retraining or fine-tuning would be prohibitively expensive, the model's parameters must stay fixed. To keep the chatbot's responses consistently helpful and harmless, the company adds a selection step: for every user query, the original model generates five candidate answers. A second, much smaller, specialized model then scores each candidate against safety and helpfulness criteria, and only the highest-scoring answer is shown to the user. Which principle does this company's strategy best illustrate?
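The pipeline described in the scenario can be sketched in a few lines of Python. This is only an illustrative sketch, not the company's actual system: `select_best_response`, `toy_generate`, and `toy_score` are hypothetical names, and the toy generator and scorer are stand-ins for the large frozen model and the small evaluator model.

```python
import itertools
from typing import Callable, List


def select_best_response(
    query: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    num_candidates: int = 5,
) -> str:
    """Sample several answers from a frozen generator and return the top-scored one."""
    # The large model is used for inference only; its parameters are never updated.
    candidates: List[str] = [generate(query) for _ in range(num_candidates)]
    # The smaller evaluator assigns each candidate a safety/helpfulness score.
    return max(candidates, key=lambda answer: score(query, answer))


# Hypothetical stand-in for the large model: cycles through canned replies.
_canned_replies = itertools.cycle([
    "I can't help with that.",
    "Figure it out yourself, idiot.",
    "Sure, here is a step-by-step answer.",
])


def toy_generate(query: str) -> str:
    return next(_canned_replies)


# Hypothetical stand-in for the evaluator: rewards helpful wording, penalizes insults.
def toy_score(query: str, answer: str) -> float:
    points = 0.0
    if "step-by-step" in answer:
        points += 1.0
    if "idiot" in answer:
        points -= 1.0
    return points


if __name__ == "__main__":
    print(select_best_response("How do I reset my password?", toy_generate, toy_score))
```

Note that the only component that encodes the safety and helpfulness criteria is the scoring function; the generator is treated as a black box, which is why the approach works when the large model cannot be modified.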
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Prompting as a Form of Inference-Time Alignment
Rescoring and Reranking for Inference-Time Alignment
Choosing an LLM Alignment Strategy
System Information in Prompts
LLM Deployment Strategy for a Startup