Evaluating Training Strategies for a Medical AI
A research lab is developing a language model to act as a medical diagnostic assistant. They are debating between two training approaches:
- Outcome-based Supervision: Rewarding the model only when it provides the correct final diagnosis.
- Process-based Supervision: Rewarding the model for each correct step in its diagnostic reasoning process (e.g., correctly identifying symptoms, listing potential conditions, and ruling out alternatives).
Evaluate the trade-offs between these two approaches, specifically in the context of this high-stakes medical application. Which approach would you recommend and why?
0
1
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A team is training a language model to solve complex, multi-step word problems. They observe that while the model frequently provides the correct final answer, its step-by-step explanation often contains logical fallacies or incorrect calculations that coincidentally cancel each other out. Which of the following training strategies would be most effective at correcting the model's flawed reasoning process, rather than just its final output?
Evaluating Training Strategies for a Medical AI
Comparing AI Tutor Training Methodologies
Solution as a Sequence of Reasoning Steps