Learn Before
Threshold Tuning in Cascading Systems
In a cascading system that runs a small, fast model first and falls back to a large, slow model only when necessary, a 'confidence threshold' determines whether the small model's output is accepted. Describe the two opposing risks a system designer must balance when setting this threshold.
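As a reference point for the question, the routing rule it describes can be sketched in a few lines of Python. This is a minimal illustration, not a specific system's implementation: the `Model` interface, the `cascade` function, and the two toy models are all hypothetical names introduced here, and the sketch assumes each model returns a label together with a confidence score between 0 and 1.

```python
from typing import Callable, Tuple

# Hypothetical interface (an assumption for this sketch):
# a model maps an input string to (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade(x: str, small: Model, large: Model, threshold: float) -> Tuple[str, str]:
    """Return (label, name of the model that produced it)."""
    label, confidence = small(x)
    if confidence >= threshold:
        # The small model is confident enough: accept its output.
        return label, "small"
    # Otherwise escalate to the large model and use its answer.
    label, _ = large(x)
    return label, "large"


# Toy stand-ins for illustration only; real models would replace these.
def toy_small(x: str) -> Tuple[str, float]:
    # Pretend short inputs are easy and long inputs are uncertain.
    return ("ok", 0.9) if len(x) < 20 else ("ok", 0.55)


def toy_large(x: str) -> Tuple[str, float]:
    return ("flagged", 0.99) if "spam" in x else ("ok", 0.99)


if __name__ == "__main__":
    print(cascade("hi there", toy_small, toy_large, threshold=0.8))
    print(cascade("a long message containing spam", toy_small, toy_large, threshold=0.8))
```

The only tunable value in this sketch is `threshold`: moving it up or down changes how many inputs are answered by the small model and how many are escalated to the large one.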
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Visual Diagram of a Cascading Model
A company is developing a system to moderate user-generated content in real-time. They have two predictive models: Model A is small, fast, and has 95% accuracy, while Model B is large, slow, and has 99.5% accuracy. The company observes that over 90% of the content is simple and easily classifiable. To optimize for both cost and performance, they decide to first process every piece of content with Model A. Only if Model A's confidence in its prediction is below a certain threshold is the content then passed to Model B for a final classification. What is the primary advantage of this two-step, conditional approach?
Optimizing Chatbot Operational Costs
Threshold Tuning in Cascading Systems