Essay

Evaluating Training Strategies for a Medical AI

A research lab is developing a language model to act as a medical diagnostic assistant. The team is choosing between two training approaches:

  1. Outcome-based Supervision: Rewarding the model only when it provides the correct final diagnosis.
  2. Process-based Supervision: Rewarding the model for each correct step in its diagnostic reasoning process (e.g., correctly identifying symptoms, listing potential conditions, and ruling out alternatives).

Evaluate the trade-offs between these two approaches, specifically in the context of this high-stakes medical application. Which approach would you recommend and why?
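
To make the difference in reward signal concrete, here is a minimal Python sketch under stated assumptions. The names `Step`, `outcome_reward`, and `process_rewards` are hypothetical, as are the per-step `correct` labels (assumed to come from some human or automated step verifier) and the placeholder diagnosis `condition_a`. It is an illustration of the two schemes, not a description of any particular lab's training setup.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One step in a diagnostic reasoning trace (hypothetical structure)."""
    description: str
    correct: bool  # assumed label from a human or automated step verifier


def outcome_reward(predicted: str, gold: str) -> float:
    """Outcome-based supervision: a single reward for the final diagnosis only."""
    return 1.0 if predicted == gold else 0.0


def process_rewards(steps: List[Step]) -> List[float]:
    """Process-based supervision: one reward per reasoning step."""
    return [1.0 if step.correct else 0.0 for step in steps]


# Illustrative trace: the final diagnosis is right, but one step is flawed.
trace = [
    Step("identify the reported symptoms", True),
    Step("list potential conditions", True),
    Step("rule out alternatives", False),
]

final_reward = outcome_reward("condition_a", "condition_a")
step_rewards = process_rewards(trace)
```

In this made-up trace, the outcome signal cannot distinguish sound reasoning from a lucky guess, while the per-step signal flags the faulty step. That difference, together with the cost of obtaining step-level labels, is one axis along which the trade-offs can be evaluated.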

Updated 2025-10-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science