Learn Before
Optimizing Chatbot Operational Costs
Based on the case study provided, propose a new system architecture that uses both available models to significantly reduce the average cost per query while maintaining a high level of user satisfaction. Describe the decision-making logic your system would use to route queries.
0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Visual Diagram of a Cascading Model
A company is developing a system to moderate user-generated content in real-time. They have two predictive models: Model A is small, fast, and has 95% accuracy, while Model B is large, slow, and has 99.5% accuracy. The company observes that over 90% of the content is simple and easily classifiable. To optimize for both cost and performance, they decide to first process every piece of content with Model A. Only if Model A's confidence in its prediction is below a certain threshold is the content then passed to Model B for a final classification. What is the primary advantage of this two-step, conditional approach?
Optimizing Chatbot Operational Costs
Threshold Tuning in Cascading Systems