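10-Week Course
Course Syllabus
Three tracks run in parallel within a single cohort. All tracks share the same clinical cases and core modules, but depth, tools, and deliverables are tailored to each learner's background.
One Learning System, Three Tracks
This is not three separate courses but one unified medical AI learning system with three entry points. Track A feeds into Track B, Track B feeds into Track C, and the governance constraints from Track C in turn set the design boundaries for Tracks A and B.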
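Track A: AI Principles
Code first
- Learners: pre-med, computer science, and science and engineering students; technical learners
- Focus: hands-on ML/DL on medical datasets using Colab, PyTorch, and Claude Code
- Capstone: build and evaluate a clinical AI model or multi-agent workflow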
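Track B: Clinical Application
Evaluate and apply
- Learners: medical students, residents, nurses, pharmacists, researchers
- Focus: AI evaluation, paper critique, deployment readiness assessment
- Capstone: clinical utility memo, paper critique, or deployment recommendation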
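Track C: Leadership and Adoption
Decide and deploy
- Learners: department chairs, innovation teams, CMOs/CIOs, clinical leaders
- Focus: AI governance, procurement, ROI modeling, organizational adoption
- Capstone: board-level AI strategy briefing with vendor evaluation and governance plan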
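Weekly Rhythm
Every week follows the same structure: case first, principles-driven, paper-backed, closing with discussion.
Clinical case opening
Deep dive into AI principles
Clinical applications and usage limitations
Paper walkthrough (latest research)
Two-track breakout discussion
10-Week Syllabus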
Week 01 -- AI in Medicine: History, Hype & the Agent Era
Track A: Build
Lecture
From Symbolic AI to the Agent Era
- AI evolution: symbolic -> ML -> DL -> transformer -> LLM -> agent
- Medical AI milestones: MYCIN -> CheXNet -> AlphaFold -> Med-PaLM -> agentic workflows
- 2026 SOTA landscape: GPT-5, Claude, Gemini, Llama 4, Qwen 3 -- open vs closed
Hands-on
Lab A1: AI Timeline + First LLM Interaction
Build an interactive AI/medical AI timeline with Claude. Compare 3 LLMs on the same clinical question.
Assignment
One-page reflection: AI's greatest potential and greatest risk in medicine
Recommended Reading
Topol, Deep Medicine (2019) -- Ch. 1 overview
Track B: Evaluate
Lecture
AI in Your Clinic -- Hype vs Reality
- Clinical perspective: MYCIN (1976) -> CDSS -> CheXNet -> Med-PaLM -> 2026 agents
- AI milestones vs actual clinical adoption gap
- Hype cycle psychology: overestimate short-term, underestimate long-term
Hands-on
Workshop B1: LLM Clinical Task Experience
Same clinical vignette across Claude, ChatGPT, Gemini. Compare DDx lists, recommended tests, and dangerous omissions.
Assignment
One-page reflection on first LLM clinical task observation
Recommended Reading
Topol, Deep Medicine (2019) -- clinician perspective on AI
Track C: Deploy
Lecture
The AI Hype Cycle -- Lessons from $62B in Failures
- IBM Watson Health full case study: $4B investment, promises vs delivery, why it collapsed
- 2026 landscape: $22B+ market, top funded companies, M&A signals
- Where is your hospital on the hype cycle?
Hands-on
Decision Lab C1: IBM Watson Health Post-Mortem
Analyze timeline, investment decisions, and org failures. Produce a 1-page decision error chain analysis.
Assignment
Read IBM Watson case + write: the most likely AI procurement mistake at your hospital
Recommended Reading
Strickland, IBM Watson Health's Rocky Journey (IEEE Spectrum)
Week 02 -- Healthcare Data: Not a Clean CSV
Track A: Build
Lecture
Healthcare Data Reality
- EHR structure: FHIR, ICD, CPT, LOINC
- Four data challenges: missingness, label noise, dataset shift, temporal leakage
- HIPAA / de-identification basics
Hands-on
Lab A2: Medical Data EDA
Explore MIMIC-IV demo subset in Colab. Find missing patterns, plot distributions, identify dataset shift signs.
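The first step of an EDA like Lab A2, finding missing-data patterns, can be sketched without any dataset-specific tooling. The records and field names below are made up for illustration and are not actual MIMIC-IV columns; the sketch only assumes that a missing value shows up as `None`.

```python
from collections import Counter

# Hypothetical rows; field names are illustrative, not MIMIC-IV columns.
records = [
    {"age": 71, "lactate": 2.1, "heart_rate": 96},
    {"age": 64, "lactate": None, "heart_rate": 88},
    {"age": None, "lactate": None, "heart_rate": 110},
    {"age": 58, "lactate": 1.4, "heart_rate": None},
]


def missing_rates(rows):
    """Return, per field, the fraction of rows where the value is None or absent."""
    fields = sorted({key for row in rows for key in row})
    return {
        field: sum(row.get(field) is None for row in rows) / len(rows)
        for field in fields
    }


def missing_patterns(rows):
    """Count how often each combination of None-valued fields occurs."""
    return Counter(
        tuple(sorted(key for key, value in row.items() if value is None))
        for row in rows
    )


rates = missing_rates(records)
patterns = missing_patterns(records)
print(rates)
print(patterns.most_common())
```

Per-field rates show which variables are sparse; the pattern counts show whether fields tend to go missing together, which often hints that the missingness is tied to how care was delivered rather than to chance.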
Assignment
Data quality memo: 3 EDA issues found + suggested remediation
Track B: Evaluate
Lecture
What AI Sees vs What You See
- EHR pitfalls: note bloat, copy-forward, coding drift, missingness patterns
- Dataset shift & population mismatch: why AI works at Stanford but breaks at community hospitals
- Image data traps: scanner variance, label quality, selection bias
Hands-on
Workshop B2: Paper Data Source Audit
Audit a medical AI paper's data: source, labels, inclusion criteria, external validation, bias risks.
Assignment
Completed data audit table + summary: data quality score (1-10) with justification
Track C: Deploy
Lecture
Data Governance -- The Boring Foundation
- Data governance maturity model: chaos -> managed -> optimized
- Legal & commercial barriers: BAA, de-identification, data licensing
- Case: Cleveland Clinic's 3-year journey to AI-ready data
Hands-on
Decision Lab C2: Vendor Data Due Diligence
Evaluate a radiology AI vendor's data sheet (designed with gaps). Produce a due diligence report.
Assignment
Data due diligence checklist + pass / flag / reject decision
Week 03 -- Classical ML & Clinical Prediction
Track A: Build
Lecture
Baselines That Actually Work
- Logistic regression, decision trees, random forest, gradient boosting
- Medical prediction tasks: readmission, sepsis risk, triage priority
- Evaluation metrics in clinical context: sensitivity vs specificity vs PPV vs NPV
Hands-on
Lab A3: Build a Readmission Predictor
Train 30-day readmission models with scikit-learn. Run ROC, confusion matrix, calibration plot. Choose clinical threshold.
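The last step of the lab, choosing a clinical threshold, comes down to how the confusion matrix changes as the cut-off moves. The lab itself uses scikit-learn; the sketch below is a dependency-free stand-in that makes the counting explicit. The scores and outcomes are invented for illustration, not lab data.

```python
def confusion_counts(y_true, y_score, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are called positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(y_true, y_score, threshold):
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, fn, tn = confusion_counts(y_true, y_score, threshold)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up predicted readmission risks and outcomes (1 = readmitted).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

for threshold in (0.25, 0.5, 0.75):
    sens, spec = sensitivity_specificity(y_true, y_score, threshold)
    print(f"threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold catches more readmissions at the price of more false alarms; which side of that trade matters more is a clinical judgment, not a property of the model.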
Assignment
Technical memo: model results + threshold selection rationale + 3 failure modes
Recommended Reading
Roberts et al., Common Pitfalls in ML for Healthcare (Nat Med 2021)
Track B: Evaluate
Lecture
Metrics That Matter (and Metrics That Lie)
- Sensitivity/specificity/PPV/NPV shift with prevalence
- AUC myth: high AUC does not equal clinical utility
- Net benefit & decision curve analysis: beyond accuracy
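Net benefit, the quantity plotted in a decision curve, weighs true positives against false positives using the odds of the chosen threshold probability: NB = TP/n - (FP/n) * pt / (1 - pt). The sketch below applies that standard formula to a hypothetical cohort; the counts and the 20% threshold are illustrative assumptions, not course data.

```python
def net_benefit(tp, fp, n, threshold_probability):
    """Net benefit of acting on a model's positives at threshold probability pt."""
    if not 0 < threshold_probability < 1:
        raise ValueError("threshold_probability must be strictly between 0 and 1")
    odds = threshold_probability / (1 - threshold_probability)
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(prevalence, threshold_probability):
    """Net benefit of treating everyone, the usual reference strategy."""
    if not 0 < threshold_probability < 1:
        raise ValueError("threshold_probability must be strictly between 0 and 1")
    odds = threshold_probability / (1 - threshold_probability)
    return prevalence - (1 - prevalence) * odds


# Hypothetical cohort: 1000 patients, 100 with the condition.
# At a 20% threshold the model flags 80 true and 150 false positives.
model_nb = net_benefit(tp=80, fp=150, n=1000, threshold_probability=0.2)
treat_all_nb = net_benefit_treat_all(prevalence=0.1, threshold_probability=0.2)
treat_none_nb = 0.0  # treating no one has zero net benefit by definition

print(f"model={model_nb:.4f} treat_all={treat_all_nb:.4f} treat_none={treat_none_nb:.4f}")
```

A model is only worth using at a given threshold if its net benefit exceeds both reference strategies; repeating the calculation across a range of thresholds produces the decision curve.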
Hands-on
Workshop B3: Threshold Trade-off Exercise
Sepsis model at AUC=0.85: pick thresholds for ICU attending vs ED triage nurse. Analyze CDSS alert fatigue scenario.
Assignment
Threshold decision worksheet + alert fatigue case analysis + written rationale
Recommended Reading
Van Calster et al., Calibration: the Achilles Heel of Predictive Analytics (BMC Med 2019)
Track C: Deploy
Lecture
AI Metrics -- What the Dashboard Should Show You
- Non-technical explanation of sensitivity, specificity, PPV, NPV
- Why 95% accuracy can be useless; AUC in 1 minute
- Metrics that matter: clinical impact, workflow efficiency, error rate reduction
Hands-on
Decision Lab C3: From High-AUC to Procurement Decision
Sepsis AI with AUC=0.88. Recalculate PPV at your prevalence, assess workflow impact, write procurement recommendation.
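Recalculating PPV at a local prevalence is an application of Bayes' rule to a fixed sensitivity and specificity. An AUC alone does not determine either, so the operating point below (85% sensitivity, 90% specificity) and the two prevalences are assumptions chosen for illustration.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from sensitivity, specificity, and prevalence."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Assumed operating point; not derived from the AUC quoted in the lab.
SENS, SPEC = 0.85, 0.90

for label, prevalence in (("study cohort", 0.20), ("local ward", 0.02)):
    print(
        f"{label}: prevalence={prevalence:.0%} "
        f"PPV={ppv(SENS, SPEC, prevalence):.1%} "
        f"NPV={npv(SENS, SPEC, prevalence):.1%}"
    )
```

With these assumed numbers, the same model goes from roughly two in three alerts being correct to roughly one in seven when prevalence falls from 20% to 2%, which is the workflow question a procurement memo has to answer.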
Assignment
Procurement decision memo: buy / defer / reject with conditions
Recommended Reading
Van Calster et al., Calibration (BMC Med 2019) -- executive-accessible version
Week 04 -- Deep Learning & Medical Imaging
Track A: Build
Lecture
From CheXNet to Foundation Models
- CNN intuition: convolution, pooling, feature maps; transfer learning
- CheXNet (2017) -> CheXpert -> MIMIC-CXR -> 2026 radiology foundation models
- Common pitfalls: shortcut learning, label leakage, scanner-specific bias
Hands-on
Lab A4: CXR Classification Notebook
Fine-tune a pre-trained model for CXR binary classification. Run Grad-CAM to visualize model attention.
Assignment
Grad-CAM screenshots + analysis: is the model learning the right features?
Recommended Reading
Rajpurkar et al., CheXNet (2017) + follow-up critique
Track B: Evaluate
Lecture
Radiology AI -- Promise and Pitfalls
- Shortcut learning: models read labels, not pathology
- External validation reality: NIH performance vs Mumbai deployment
- Deployment evidence: radiologist+AI vs radiologist alone vs AI alone
Hands-on
Workshop B4: Paper Critique #1 -- CXR AI
Structured critique of a CXR AI paper: problem definition, data quality, method, results, overstated conclusions.
Assignment
Complete paper critique report using provided template
Recommended Reading
Seyyed-Kalantari et al., Underdiagnosis Bias in CXR AI (Nat Med 2021)
Track C: Deploy
Lecture
Deploying Radiology AI -- The First 100 Days
- 200+ FDA-cleared radiology AI products, but how many are truly deployed?
- Pilot design 101: population, duration, success/exit criteria
- Monitoring: model drift, performance degradation, feedback loops
Hands-on
Decision Lab C4: CXR AI Pilot Charter + CDSS Comparison
Design pilot charters for episodic (CXR triage) vs continuous (CDSS drug interaction) AI. Compare KPIs and rollback plans.
Assignment
Two pilot charters (CXR + CDSS) + comparison analysis: episodic vs continuous AI deployment
Recommended Reading
Mayo Clinic AI Deployment Framework
Week 05 -- Transformers, LLMs & Clinical NLP
Track A: Build
Lecture
Language Models in Medicine
- Attention mechanism -> transformer architecture
- Pre-training -> fine-tuning -> RLHF -> instruction tuning -> tool use
- Clinical NLP: note summarization, ICD coding, patient education; hallucination danger
Hands-on
Lab A5: Clinical Text Pipeline + CDSS Prototype
Build clinical note extraction pipeline + LLM-based medication safety checker via Claude API. Compare with rule-based checker.
Assignment
Pipeline code + 3-case hallucination audit table + CDSS alert accuracy analysis
Recommended Reading
Singhal et al., Med-PaLM 2 (2023)
Track B: Evaluate
Lecture
Clinical LLMs -- Capabilities, Failures & Hallucination
- USMLE scores vs bedside gap: exams != clinical care
- Hallucination taxonomy: fabricated citations, plausible-but-wrong reasoning
- Automation bias: why you unconsciously trust AI
Hands-on
Workshop B5: LLM Head-to-Head Clinical Comparison
5 clinical + 2 patient-facing cases across 3 LLMs. Score DDx completeness, hallucination, safety, and patient-friendliness.
Assignment
7-case LLM comparison table + summary: should clinician-facing vs patient-facing AI have different standards?
Recommended Reading
Singhal et al., Med-PaLM 2 (2023) + benchmark vs bedside critique
Track C: Deploy
Lecture
Ambient Scribe & Documentation AI -- The $18B Question
- Market: Nuance DAX, Abridge, Nabla, Suki -- real value is workflow redesign
- Evidence gap: which products have RCTs vs testimonials only
- Risk: hallucination in clinical notes, liability, patient consent, data residency
Hands-on
Decision Lab C5: Documentation AI + Patient Chatbot Vendor Evaluation
Score 3 ambient scribe vendors + 1 patient chatbot vendor. Use LLM to find red flags. Calculate TCO.
Assignment
Vendor evaluation scorecard (documentation AI + patient chatbot) + final recommendation memo
Recommended Reading
LLM Chatbot for Care Transitions (Nature Medicine 2026)
Week 06 -- The Agent Era: Coding Agents & Multi-Agent Systems
Track A: Build
Lecture
The Agent Era -- Beyond Prompting
- Prompting -> tool use -> coding agents -> multi-agent orchestration
- Claude Code, Codex CLI, Cursor, Windsurf: positioning & capabilities
- OpenClaw architecture: agent definition, memory, context engineering, task routing
Hands-on
Lab A6: Build a Medical AI Project with Claude Code
Use natural language + Claude Code to build from scratch: sepsis warning pipeline OR LLM-CDSS with RAG + review UI.
Assignment
Claude Code session log + final project + 1-page reflection: what the agent helped vs deceived
Recommended Reading
Anthropic, Vibe Physics (2026) -- AI as research collaborator, eager-to-please problem
Track B: Evaluate
Lecture
The Clinician as AI Director
- You don't need to code -- you need to describe problems precisely
- Vibe Physics case: Harvard professor uses Claude Code for theoretical physics
- Medical scenarios: natural language -> clinical calculator prototype
Hands-on
Workshop B6: Natural Language -> Clinical Prototype
Direct Claude Code via natural language to build a CHA2DS2-VASc calculator, drug interaction checker, or discharge summary generator.
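When reviewing what an agent produces for the CHA2DS2-VASc option, it helps to have a small reference to compare against. The sketch below follows the commonly published point weights (1 each for congestive heart failure, hypertension, diabetes, vascular disease, age 65-74, and female sex; 2 each for age 75 or older and prior stroke/TIA/thromboembolism). The `Patient` class and its field names are hypothetical, and the code is a review aid rather than a validated clinical tool.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical input record; field names are illustrative."""

    age: int
    female: bool
    congestive_heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia_thromboembolism: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(patient: Patient) -> int:
    """Return the CHA2DS2-VASc score (0-9) for a patient."""
    score = 0
    score += 1 if patient.congestive_heart_failure else 0
    score += 1 if patient.hypertension else 0
    score += 1 if patient.diabetes else 0
    score += 1 if patient.vascular_disease else 0
    score += 1 if patient.female else 0
    score += 2 if patient.prior_stroke_tia_thromboembolism else 0
    # Age contributes 2 points at 75 or older, otherwise 1 point at 65-74.
    if patient.age >= 75:
        score += 2
    elif patient.age >= 65:
        score += 1
    return score


example = Patient(age=78, female=True, hypertension=True, diabetes=True)
print(cha2ds2_vasc(example))
```

Boundary ages (64, 65, 74, 75) and the all-risk-factors case are the inputs most likely to expose an agent's mistakes, so they make good first test cases for the review notes.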
Assignment
Clinical tool specification (natural language) + agent output review notes: what's right, wrong, and how you fixed it
Recommended Reading
Schwartz, Vibe Physics (Anthropic 2026)
Track C: Deploy
Lecture
The AI Stack -- What Executives Must Understand
- AI stack: foundation model -> application -> workflow -> governance layer
- Open-source vs closed-source strategy: cost, control, compliance
- From chatbot -> coding agent -> multi-agent: Claude Code, Codex, OpenClaw
Hands-on
Decision Lab C6: Hospital AI Tooling Stack Workshop
Map your hospital's 4-layer AI stack: model, application, workflow, governance. Define build vs buy vs partner decisions.
Assignment
Hospital AI tooling stack diagram + build/buy/partner decision matrix
Week 07 -- Model Evaluation, Reproducibility & Failure Modes
Track A: Build
Lecture
When Good Metrics Go Bad
- AUC trap: high AUC != clinically useful; calibration & net benefit
- Subgroup performance: disparities across age, sex, race
- Reproducibility crisis: why paper results fail in your hands
Hands-on
Lab A7: Reproduce & Critique
Subgroup analysis, calibration plot, and failure mode identification on your Week 3/4 models. Compare against published results.
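The numbers behind a calibration plot are simple to compute by hand: group predictions into probability bins and compare the mean predicted risk with the observed event rate in each bin. The sketch below uses equal-width bins and invented predictions; running it separately on each subgroup's rows is one way to combine it with the subgroup analysis.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns one (mean_predicted, observed_rate, count) tuple per non-empty
    bin, ordered from the lowest bin to the highest.
    """
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        bins[index].append((label, prob))

    table = []
    for members in bins:
        if not members:
            continue
        count = len(members)
        mean_predicted = sum(prob for _, prob in members) / count
        observed_rate = sum(label for label, _ in members) / count
        table.append((mean_predicted, observed_rate, count))
    return table


# Made-up outcomes and predicted risks for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.15, 0.30, 0.35, 0.60, 0.65, 0.90, 0.95]

table = calibration_table(y_true, y_prob, n_bins=2)
for mean_predicted, observed_rate, count in table:
    print(f"predicted={mean_predicted:.3f} observed={observed_rate:.3f} n={count}")
```

A well-calibrated model has predicted and observed values close together in every bin; a model can rank patients well (high AUC) and still fail this check.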
Assignment
Evaluation report: subgroup results + calibration + failure mode + improvement suggestions
Recommended Reading
Roberts et al., Common Pitfalls in ML for Healthcare (Nat Med 2021)
Track B: Evaluate
Lecture
Advanced Failure Modes in Medical AI
- Leakage, shortcut learning, subgroup disparity
- Reproducibility: why you can't replicate paper results
- p-hacking in ML: model/metric/dataset selection degrees of freedom
Hands-on
Workshop B7: Paper Critique #2 -- Find the Flaw
Critique 2 traditional + 2 CDSS/patient chatbot papers (Nature Medicine 2026, NEJM AI 2025). Find hidden flaws.
Assignment
Full critique of 2+ papers (including 1 CDSS/chatbot) + 3 major questions for authors
Recommended Reading
LLM Chatbot for Mental Health Treatment (NEJM AI 2025, RCT)
Track C: Deploy
Lecture
AI ROI -- Beyond the Vendor Slide Deck
- ROI structure: cost avoidance vs revenue vs quality vs risk reduction
- Hidden costs: integration, training, workflow redesign, ongoing monitoring
- Exit strategy: when to shut down an AI tool
Hands-on
Decision Lab C7: ROI Calculator Workshop
Calculate direct/indirect ROI for a 6-month sepsis AI pilot. Run sensitivity analysis on false positive rate changes.
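A sensitivity analysis of this kind can be sketched as a small function evaluated over a range of inputs. Everything in the model below is an assumption made for illustration: the cost structure, the dollar figures, and the simplification that false positives enter as the share of alerts that turn out to be false (`false_alert_fraction`). It is not the workshop's calculator.

```python
def pilot_roi(
    alerts_per_month,
    false_alert_fraction,
    benefit_per_true_alert,
    review_cost_per_alert,
    fixed_cost,
    months=6,
):
    """Return ROI as (benefit - cost) / cost for a pilot of the given length."""
    total_alerts = alerts_per_month * months
    true_alerts = total_alerts * (1 - false_alert_fraction)
    benefit = true_alerts * benefit_per_true_alert
    # Every alert costs staff time to review, whether it is true or false.
    cost = fixed_cost + total_alerts * review_cost_per_alert
    return (benefit - cost) / cost


# Hypothetical pilot parameters.
base = dict(
    alerts_per_month=200,
    benefit_per_true_alert=1500,
    review_cost_per_alert=50,
    fixed_cost=300_000,
)

for fraction in (0.5, 0.6, 0.7, 0.8, 0.9):
    roi = pilot_roi(false_alert_fraction=fraction, **base)
    print(f"false alerts={fraction:.0%} ROI={roi:+.2f}")
```

Under these assumed figures the pilot breaks even when 80% of alerts are false; locating that break-even point, rather than a single ROI number, is what supports a go/no-go recommendation.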
Assignment
ROI worksheet + sensitivity analysis + go/no-go recommendation
Recommended Reading
Kaiser Permanente AI ROI Framework
Week 08 -- Multi-Agent Systems & Hospital Automation
Track A: Build
Lecture
Multi-Agent Systems for Healthcare
- Single agent vs multi-agent: when to decompose tasks
- OpenClaw deep dive: agent definitions, memory model, context management
- Medical multi-agent: literature review + analysis + report generation pipelines
Hands-on
Lab A8: OpenClaw Medical AI Pipeline Demo
Interact with a pre-built 4-agent pipeline: paper intake -> method extraction -> PubMed search -> structured review report.
Assignment
Design your own 3-agent medical workflow: text description + agent definitions + expected I/O
Recommended Reading
Anthropic, Long-Running Claude for Scientific Computing (2026)
Track B: Evaluate
Lecture
Multi-Agent Systems -- What Clinicians Need to Know
- Why one AI isn't enough: specialized agents for evidence synthesis
- OpenClaw: methodology agent + bias agent + clinical agent collaboration
- Human-in-the-loop: what to automate vs what requires your eyes
Hands-on
Workshop B8: OpenClaw Evidence Synthesis Demo
Live demo of multi-agent paper review. Modify an agent prompt and observe output changes. Can this replace your journal club?
Assignment
Describe a multi-agent clinical workflow you want + why multiple agents + where human review is mandatory
Track C: Deploy
Lecture
Multi-Agent AI in Hospital Operations
- Agentic workflow scenarios: QA routing, prior auth, clinical trial matching, bed management
- CDSS agentic pipeline: order intake -> RAG search -> LLM reasoning -> severity routing
- Risk & governance: what can be fully automated vs human-approved
Hands-on
Decision Lab C8: OpenClaw Hospital Automation Demo
Live demo of QA event routing pipeline. Design an agentic workflow for your hospital process. Estimate FTE replacement ROI.
Assignment
Agentic workflow design + cost-benefit sketch
Recommended Reading
Anthropic, Long-Running Claude for Scientific Computing (2026)
Week 09 -- Regulation, Ethics & Safe Deployment
Track A: Build
Lecture
Building Within Boundaries
- FDA SaMD classification: 510(k) / De Novo / PMA pathways
- EU AI Act high-risk classification; WHO 2025 guidance on health LLMs
- Liability, model cards, datasheets for datasets, algorithmic impact assessments
Hands-on
Lab A9: Compliance Constraint Checklist
Apply compliance checklist to your Week 6 Claude Code project: de-identification, intended use, FDA level, monitoring plan.
Assignment
Completed compliance checklist + revised project scope
Recommended Reading
FDA SaMD Framework + WHO Guidance on Health LLMs (2025)
Track B: Evaluate
Lecture
Using AI Safely in Clinical Practice
- FDA SaMD framework: what level is your AI tool?
- WHO 6 principles for health LLMs; EU AI Act implications
- Liability: if AI is wrong, who is responsible -- you or the vendor?
Hands-on
Workshop B9: Draft a Safe-Use Protocol
Write a safe-use protocol for an AI clinical note summarizer: intended use, human review nodes, incident reporting, exit criteria.
Assignment
Completed safe-use protocol using provided template
Recommended Reading
WHO Guidance on Ethics & Governance of LLMs in Health (2025)
Track C: Deploy
Lecture
Building an AI Governance Program
- Governance 4 pillars: policy, process, people, technology
- Committee structure: AI governance board, clinical AI review, IT security
- Patient chatbot governance: consent, scope limits, emergency escalation, adverse event reporting
Hands-on
Decision Lab C9: Governance Playbook Workshop
Build a full AI governance playbook: RACI matrix, risk classification, incident response, patient chatbot governance checklist.
Assignment
Governance playbook + RACI matrix + patient chatbot governance checklist
Recommended Reading
FDA SaMD Framework + EU AI Act + WHO LLM Guidance (2025)
Week 10 -- Capstone: Demo Day
Track A: Build
Lecture
Capstone Presentations
- 5-min demo: data -> model -> evaluation pipeline + Claude Code / OpenClaw process
- Technical memo (2-3 pages): problem, data, methods, results, limitations, compliance
- Peer review: evaluate another student's project using Track B frameworks
Hands-on
Lab A10: Demo Day + Peer Review
Present your end-to-end medical AI project. Receive structured peer feedback across technical correctness, clinical reasoning, and agent tool usage.
Assignment
Final deliverables: notebook + technical memo + demo + peer review
Track B: Evaluate
Lecture
Capstone Presentations
- AI Tool Evaluation Memo (3-4 pages): evidence quality, clinical usability, risk, recommendation
- Trust / Use-with-caution / Reject recommendation + safe-use protocol
- 5-min presentation to simulated hospital committee
Hands-on
Workshop B10: Demo Day + Peer Review
Present your AI tool evaluation to a simulated hospital committee. Receive peer feedback on evidence assessment and clinical judgment.
Assignment
Final deliverables: evaluation memo + peer review + 5-min presentation
Track C: Deploy
Lecture
Executive AI Strategy Simulation
- Scenario: 500-bed hospital, $2M AI budget, 12 months to deploy 2 use cases
- Executive strategy memo (3-5 pages): use cases, vendor eval, roadmap, governance, ROI
- 10-min board presentation + peer challenge from other teams
Hands-on
Decision Lab C10: Board Presentation + Peer Challenge
Present your AI strategy to a simulated board. Defend: why these use cases? What if the first one fails? What are competitors doing?
Assignment
Final deliverables: executive strategy memo + board presentation + peer challenge
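Anchor Cases
All three tracks revisit these clinical anchor cases from different angles, building a shared language across disciplines.
CXR / Radiology AI
From CNN architecture to reader studies, workflow integration, and procurement evaluation.
EHR / Clinical Note Summarization
From Transformer embeddings to hallucination risk, documentation support, and vendor evaluation.
Sepsis / Deterioration Prediction
From risk-score modeling to threshold setting, clinical utility, and deployment monitoring.
Assessment
We do not test who can best recite AI terminology. We assess who can define the problem, match models to tasks, evaluate evidence, and judge clinical safety.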