10-Week Course

Syllabus

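Three tracks run in parallel in one shared classroom. Every track uses the same clinical cases and core modules, but depth, tools, and deliverables are tailored to each learner's background.

One Learning System, Three Tracks

These are not three separate courses but one unified medical AI learning system with three entry points. Track A feeds into Track B, Track B feeds into Track C, and Track C's governance constraints in turn set the design boundaries for Tracks A and B.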

Build

Track A: AI Principles

Code-first

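  • Learners: pre-med, computer science, and science/engineering students; technical learners
  • Focus: hands-on ML/DL on medical datasets with Colab, PyTorch, and Claude Code
  • Capstone: build and evaluate a clinical AI model or multi-agent workflow

Tools

Google Colab, PyTorch, scikit-learn, Hugging Face, Claude Code, OpenClaw

Evaluate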

Track B: Clinical Application

Evaluate and apply

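  • Learners: medical students, residents, nurses, pharmacists, researchers
  • Focus: AI evaluation, paper critique, deployment-readiness assessment
  • Capstone: clinical utility memo, paper critique, or deployment recommendation

Tools

No-code templates, paper critique framework, LLM comparison tool, decision dashboard

Deploy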

Track C: Leadership and Adoption

Decide and deploy

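  • Learners: department chairs, innovation teams, CMOs/CIOs, clinical leaders
  • Focus: AI governance, procurement, ROI modeling, organizational adoption
  • Capstone: board-level AI strategy briefing with vendor evaluation and governance plan

Tools

ROI calculator, vendor evaluation matrix, governance checklist, pilot roadmap template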

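Weekly Rhythm

Every week follows the same structure: open with a case, drive with principles, support with papers, close with discussion.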

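  • 20 min: Clinical case opening
  • 25 min: AI principles deep dive
  • 25 min: Clinical applications and limitations of use
  • 15 min: Guided paper reading (latest research)
  • 15–30 min: Breakout discussion by track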

10-Week Syllabus

Week 01

AI in Medicine: History, Hype & the Agent Era

Track A: Build

Lecture

From Symbolic AI to the Agent Era

  • AI evolution: symbolic -> ML -> DL -> transformer -> LLM -> agent
  • Medical AI milestones: MYCIN -> CheXNet -> AlphaFold -> Med-PaLM -> agentic workflows
  • 2026 SOTA landscape: GPT-5, Claude, Gemini, Llama 4, Qwen 3 -- open vs closed

Hands-on

Lab A1: AI Timeline + First LLM Interaction

Build an interactive AI/medical AI timeline with Claude. Compare 3 LLMs on the same clinical question.

Assignment

One-page reflection: AI's greatest potential and greatest risk in medicine

Recommended Reading

Topol, Deep Medicine (2019) -- Ch. 1 overview

Track B: Evaluate

Lecture

AI in Your Clinic -- Hype vs Reality

  • Clinical perspective: MYCIN (1976) -> CDSS -> CheXNet -> Med-PaLM -> 2026 agents
  • AI milestones vs actual clinical adoption gap
  • Hype cycle psychology: overestimate short-term, underestimate long-term

Hands-on

Workshop B1: LLM Clinical Task Experience

Same clinical vignette across Claude, ChatGPT, Gemini. Compare DDx lists, recommended tests, and dangerous omissions.

Assignment

One-page reflection on first LLM clinical task observation

Recommended Reading

Topol, Deep Medicine (2019) -- clinician perspective on AI

Track C: Deploy

Lecture

The AI Hype Cycle -- Lessons from $62B in Failures

  • IBM Watson Health full case study: $4B investment, promises vs delivery, why it collapsed
  • 2026 landscape: $22B+ market, top funded companies, M&A signals
  • Where is your hospital on the hype cycle?

Hands-on

Decision Lab C1: IBM Watson Health Post-Mortem

Analyze timeline, investment decisions, and org failures. Produce a 1-page decision error chain analysis.

Assignment

Read IBM Watson case + write: the most likely AI procurement mistake at your hospital

Recommended Reading

Strickland, IBM Watson Health's Rocky Journey (IEEE Spectrum)

Week 02

Healthcare Data: Not a Clean CSV

Track A: Build

Lecture

Healthcare Data Reality

  • EHR structure: FHIR, ICD, CPT, LOINC
  • Four data challenges: missingness, label noise, dataset shift, temporal leakage
  • HIPAA / de-identification basics
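
Of the four data challenges listed above, temporal leakage is the easiest to show in code, because it comes down to how the train/test split is drawn. The following is a minimal sketch, assuming hypothetical admission records with an `admit_date` field; the field names, records, and cutoff date are illustrative and are not taken from MIMIC-IV.

```python
from datetime import date

# Hypothetical admission records; field names are illustrative only.
admissions = [
    {"patient_id": 1, "admit_date": date(2021, 3, 1), "readmitted": 0},
    {"patient_id": 2, "admit_date": date(2021, 9, 15), "readmitted": 1},
    {"patient_id": 3, "admit_date": date(2022, 2, 10), "readmitted": 0},
    {"patient_id": 4, "admit_date": date(2022, 8, 5), "readmitted": 1},
]

def temporal_split(records, cutoff):
    """Train on admissions before the cutoff, test on admissions on or after it."""
    train = [r for r in records if r["admit_date"] < cutoff]
    test = [r for r in records if r["admit_date"] >= cutoff]
    return train, test

train, test = temporal_split(admissions, cutoff=date(2022, 1, 1))
print(len(train), "training admissions,", len(test), "test admissions")
```

A random shuffle would let admissions from 2022 inform a model that is then scored on 2021 cases; splitting on time keeps the evaluation closer to how the model would be used prospectively.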

Hands-on

Lab A2: Medical Data EDA

Explore MIMIC-IV demo subset in Colab. Find missing patterns, plot distributions, identify dataset shift signs.

Assignment

Data quality memo: 3 EDA issues found + suggested remediation

Track B: Evaluate

Lecture

What AI Sees vs What You See

  • EHR pitfalls: note bloat, copy-forward, coding drift, missingness patterns
  • Dataset shift & population mismatch: why AI works at Stanford but breaks at community hospitals
  • Image data traps: scanner variance, label quality, selection bias

Hands-on

Workshop B2: Paper Data Source Audit

Audit a medical AI paper's data: source, labels, inclusion criteria, external validation, bias risks.

Assignment

Completed data audit table + summary: data quality score (1-10) with justification

Track C: Deploy

Lecture

Data Governance -- The Boring Foundation

  • Data governance maturity model: chaos -> managed -> optimized
  • Legal & commercial barriers: BAA, de-identification, data licensing
  • Case: Cleveland Clinic's 3-year journey to AI-ready data

Hands-on

Decision Lab C2: Vendor Data Due Diligence

Evaluate a radiology AI vendor's data sheet (designed with gaps). Produce a due diligence report.

Assignment

Data due diligence checklist + pass / flag / reject decision

Week 03

Classical ML & Clinical Prediction

Track A: Build

Lecture

Baselines That Actually Work

  • Logistic regression, decision trees, random forest, gradient boosting
  • Medical prediction tasks: readmission, sepsis risk, triage priority
  • Evaluation metrics in clinical context: sensitivity vs specificity vs PPV vs NPV

Hands-on

Lab A3: Build a Readmission Predictor

Train 30-day readmission models with scikit-learn. Run ROC, confusion matrix, calibration plot. Choose clinical threshold.
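
The "choose clinical threshold" step in this lab can be sketched in plain Python before moving to scikit-learn. The example below assumes a toy set of labels and predicted probabilities (both made up) and a hypothetical clinical requirement of at least 75% sensitivity; the helper names `confusion_counts` and `pick_threshold` are illustrative, not part of any library.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn) when predicting positive at prob >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def pick_threshold(y_true, y_prob, min_sensitivity):
    """Highest threshold whose sensitivity still meets the clinical floor.

    Assumes the data contain at least one positive and one negative case.
    Returns (threshold, sensitivity, specificity), or None if no threshold qualifies.
    """
    for t in sorted(set(y_prob), reverse=True):
        tp, fp, tn, fn = confusion_counts(y_true, y_prob, t)
        sensitivity = tp / (tp + fn)
        if sensitivity >= min_sensitivity:
            return t, sensitivity, tn / (tn + fp)
    return None

# Toy data: 4 readmitted (1) and 4 not readmitted (0) patients.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]

result = pick_threshold(y_true, y_prob, min_sensitivity=0.75)
print(result)
```

Sweeping the threshold from high to low traces the same points an ROC curve plots; stopping at the first threshold that meets the sensitivity floor keeps specificity as high as that constraint allows.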

Assignment

Technical memo: model results + threshold selection rationale + 3 failure modes

Recommended Reading

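Roberts et al., Common Pitfalls and Recommendations for Using Machine Learning to Detect and Prognosticate for COVID-19 Using Chest Radiographs and CT Scans (Nat Mach Intell 2021)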

Track B: Evaluate

Lecture

Metrics That Matter (and Metrics That Lie)

  • Sensitivity/specificity/PPV/NPV shift with prevalence
  • AUC myth: high AUC does not equal clinical utility
  • Net benefit & decision curve analysis: beyond accuracy
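
The first bullet above follows directly from Bayes' rule: sensitivity and specificity are properties of the test, while PPV and NPV also depend on how common the condition is. A minimal sketch, assuming a hypothetical model with 90% sensitivity and 90% specificity (numbers chosen for illustration only):

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Positive and negative predictive value from test characteristics and prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)

# Same hypothetical model, three different care settings.
for prevalence in (0.20, 0.05, 0.01):
    ppv, npv = ppv_npv(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.2f}  NPV={npv:.3f}")
```

With these assumed numbers, PPV falls from roughly 0.69 at 20% prevalence to roughly 0.08 at 1% prevalence, even though the model itself has not changed.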

Hands-on

Workshop B3: Threshold Trade-off Exercise

Sepsis model at AUC=0.85: pick thresholds for ICU attending vs ED triage nurse. Analyze CDSS alert fatigue scenario.
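
One way to compare candidate thresholds in this exercise is the net benefit measure from decision curve analysis, which counts true positives per patient and subtracts false positives weighted by the odds of the chosen threshold probability. The sketch below uses made-up counts for a cohort of 1,000 patients with 100 sepsis cases; none of these numbers come from a real model.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit at a threshold probability: TP/n - FP/n * (pt / (1 - pt))."""
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight

def net_benefit_treat_all(n_pos, n, threshold):
    """Reference strategy: flag every patient, so all positives are TP and all negatives are FP."""
    return net_benefit(tp=n_pos, fp=n - n_pos, n=n, threshold=threshold)

# Hypothetical cohort: 1000 patients, 100 with sepsis.
# At a 20% threshold the model flags 80 true and 150 false positives.
model_nb = net_benefit(tp=80, fp=150, n=1000, threshold=0.20)
treat_all_nb = net_benefit_treat_all(n_pos=100, n=1000, threshold=0.20)
print(f"model: {model_nb:.4f}  treat-all: {treat_all_nb:.4f}  treat-none: 0.0000")
```

A model is only worth acting on at a given threshold if its net benefit exceeds both reference strategies (treat all and treat none). Repeating the calculation across thresholds gives the decision curve.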

Assignment

Threshold decision worksheet + alert fatigue case analysis + written rationale

Recommended Reading

Van Calster et al., Calibration: the Achilles Heel of Predictive Analytics (BMC Med 2019)

Track C: Deploy

Lecture

AI Metrics -- What the Dashboard Should Show You

  • Non-technical explanation of sensitivity, specificity, PPV, NPV
  • Why 95% accuracy can be useless; AUC in 1 minute
  • Metrics that matter: clinical impact, workflow efficiency, error rate reduction

Hands-on

Decision Lab C3: From High-AUC to Procurement Decision

Sepsis AI with AUC=0.88. Recalculate PPV at your prevalence, assess workflow impact, write procurement recommendation.

Assignment

Procurement decision memo: buy / defer / reject with conditions

Recommended Reading

Van Calster et al., Calibration (BMC Med 2019) -- executive-accessible version

Week 04

Deep Learning & Medical Imaging

Track A: Build

Lecture

From CheXNet to Foundation Models

  • CNN intuition: convolution, pooling, feature maps; transfer learning
  • CheXNet (2017) -> CheXpert -> MIMIC-CXR -> 2026 radiology foundation models
  • Common pitfalls: shortcut learning, label leakage, scanner-specific bias

Hands-on

Lab A4: CXR Classification Notebook

Fine-tune a pre-trained model for CXR binary classification. Run Grad-CAM to visualize model attention.

Assignment

Grad-CAM screenshots + analysis: is the model learning the right features?

Recommended Reading

Rajpurkar et al., CheXNet (2017) + follow-up critique

Track B: Evaluate

Lecture

Radiology AI -- Promise and Pitfalls

  • Shortcut learning: models read labels, not pathology
  • External validation reality: NIH performance vs Mumbai deployment
  • Deployment evidence: radiologist+AI vs radiologist alone vs AI alone

Hands-on

Workshop B4: Paper Critique #1 -- CXR AI

Structured critique of a CXR AI paper: problem definition, data quality, method, results, overstated conclusions.

Assignment

Complete paper critique report using provided template

Recommended Reading

Seyyed-Kalantari et al., Underdiagnosis Bias in CXR AI (Nat Med 2021)

Track C: Deploy

Lecture

Deploying Radiology AI -- The First 100 Days

  • 200+ FDA-cleared radiology AI products, but how many are truly deployed?
  • Pilot design 101: population, duration, success/exit criteria
  • Monitoring: model drift, performance degradation, feedback loops

Hands-on

Decision Lab C4: CXR AI Pilot Charter + CDSS Comparison

Design pilot charters for episodic (CXR triage) vs continuous (CDSS drug interaction) AI. Compare KPIs and rollback plans.

Assignment

Two pilot charters (CXR + CDSS) + comparison analysis: episodic vs continuous AI deployment

Recommended Reading

Mayo Clinic AI Deployment Framework

Week 05

Transformers, LLMs & Clinical NLP

Track A: Build

Lecture

Language Models in Medicine

  • Attention mechanism -> transformer architecture
  • Pre-training -> fine-tuning -> RLHF -> instruction tuning -> tool use
  • Clinical NLP: note summarization, ICD coding, patient education; hallucination danger
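
The attention mechanism named in the first bullet can be written in a few lines. This is a dependency-free sketch of scaled dot-product attention, softmax(QKᵀ/√d)V, on plain Python lists; the toy vectors are invented, and a real transformer would add learned projections, multiple heads, and batched tensor operations.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on lists of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Three toy "token" vectors attending to each other (self-attention).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```

Each output row is a weighted mix of all value vectors, with more weight on the tokens whose keys best match the query.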

Hands-on

Lab A5: Clinical Text Pipeline + CDSS Prototype

Build clinical note extraction pipeline + LLM-based medication safety checker via Claude API. Compare with rule-based checker.

Assignment

Pipeline code + 3-case hallucination audit table + CDSS alert accuracy analysis

Recommended Reading

Singhal et al., Med-PaLM 2 (2023)

Track B: Evaluate

Lecture

Clinical LLMs -- Capabilities, Failures & Hallucination

  • USMLE scores vs bedside gap: exams != clinical care
  • Hallucination taxonomy: fabricated citations, plausible-but-wrong reasoning
  • Automation bias: why you unconsciously trust AI

Hands-on

Workshop B5: LLM Head-to-Head Clinical Comparison

5 clinical + 2 patient-facing cases across 3 LLMs. Score DDx completeness, hallucination, safety, and patient-friendliness.

Assignment

7-case LLM comparison table + summary: should clinician-facing vs patient-facing AI have different standards?

Recommended Reading

Singhal et al., Med-PaLM 2 (2023) + benchmark vs bedside critique

Track C: Deploy

Lecture

Ambient Scribe & Documentation AI -- The $18B Question

  • Market: Nuance DAX, Abridge, Nabla, Suki -- real value is workflow redesign
  • Evidence gap: which products have RCTs vs testimonials only
  • Risk: hallucination in clinical notes, liability, patient consent, data residency

Hands-on

Decision Lab C5: Documentation AI + Patient Chatbot Vendor Evaluation

Score 3 ambient scribe vendors + 1 patient chatbot vendor. Use LLM to find red flags. Calculate TCO.

Assignment

Vendor evaluation scorecard (documentation AI + patient chatbot) + final recommendation memo

Recommended Reading

LLM Chatbot for Care Transitions (Nature Medicine 2026)

Week 06

The Agent Era: Coding Agents & Multi-Agent Systems

Track A: Build

Lecture

The Agent Era -- Beyond Prompting

  • Prompting -> tool use -> coding agents -> multi-agent orchestration
  • Claude Code, Codex CLI, Cursor, Windsurf: positioning & capabilities
  • OpenClaw architecture: agent definition, memory, context engineering, task routing

Hands-on

Lab A6: Build a Medical AI Project with Claude Code

Use natural language + Claude Code to build from scratch: sepsis warning pipeline OR LLM-CDSS with RAG + review UI.

Assignment

Claude Code session log + final project + 1-page reflection: where the agent helped vs where it misled you

Recommended Reading

Anthropic, Vibe Physics (2026) -- AI as research collaborator, eager-to-please problem

Track B: Evaluate

Lecture

The Clinician as AI Director

  • You don't need to code -- you need to describe problems precisely
  • Vibe Physics case: Harvard professor uses Claude Code for theoretical physics
  • Medical scenarios: natural language -> clinical calculator prototype

Hands-on

Workshop B6: Natural Language -> Clinical Prototype

Direct Claude Code via natural language to build a CHA2DS2-VASc calculator, drug interaction checker, or discharge summary generator.
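
When reviewing what the agent produces for the CHA2DS2-VASc option, it helps to have a small reference to check against. The sketch below encodes the standard scoring components (congestive heart failure, hypertension, age, diabetes, prior stroke/TIA/thromboembolism, vascular disease, sex category); the function name and argument names are illustrative choices, and the code is a teaching aid for this workshop rather than a validated clinical tool.

```python
def cha2ds2_vasc(age, female, chf, hypertension, diabetes, stroke_tia, vascular_disease):
    """CHA2DS2-VASc stroke-risk score for atrial fibrillation (range 0 to 9)."""
    score = 0
    score += 1 if chf else 0                # C: congestive heart failure
    score += 1 if hypertension else 0       # H: hypertension
    if age >= 75:                           # A2: age 75 or older scores 2
        score += 2
    elif age >= 65:                         # A: age 65 to 74 scores 1
        score += 1
    score += 1 if diabetes else 0           # D: diabetes mellitus
    score += 2 if stroke_tia else 0         # S2: prior stroke, TIA, or thromboembolism
    score += 1 if vascular_disease else 0   # V: vascular disease
    score += 1 if female else 0             # Sc: sex category (female)
    return score

# Example patient: 78-year-old woman with hypertension and a prior stroke.
print(cha2ds2_vasc(age=78, female=True, chf=False, hypertension=True,
                   diabetes=False, stroke_tia=True, vascular_disease=False))
```

Typical agent errors worth testing for are the age bands (65 and 75 as inclusive boundaries) and the double-weighted items, which is why a few boundary cases make a useful review checklist.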

Assignment

Clinical tool specification (natural language) + agent output review notes: what's right, wrong, and how you fixed it

Recommended Reading

Schwartz, Vibe Physics (Anthropic 2026)

Track C: Deploy

Lecture

The AI Stack -- What Executives Must Understand

  • AI stack: foundation model -> application -> workflow -> governance layer
  • Open-source vs closed-source strategy: cost, control, compliance
  • From chatbot -> coding agent -> multi-agent: Claude Code, Codex, OpenClaw

Hands-on

Decision Lab C6: Hospital AI Tooling Stack Workshop

Map your hospital's 4-layer AI stack: model, application, workflow, governance. Define build vs buy vs partner decisions.

Assignment

Hospital AI tooling stack diagram + build/buy/partner decision matrix

Week 07

Model Evaluation, Reproducibility & Failure Modes

Track A: Build

Lecture

When Good Metrics Go Bad

  • AUC trap: high AUC != clinically useful; calibration & net benefit
  • Subgroup performance: disparities across age, sex, race
  • Reproducibility crisis: why paper results fail in your hands
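
The subgroup bullet above comes down to recomputing the same metric separately for each patient group instead of once for the whole test set. A minimal sketch, assuming made-up predictions tagged with an age band (the group labels and records are illustrative):

```python
from collections import defaultdict

def sensitivity_by_group(records):
    """records: iterable of (group, y_true, y_pred). Returns {group: sensitivity}."""
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            positives[group] += 1
            true_positives[group] += y_pred
    # Groups with no positive cases are omitted rather than reported as 0.
    return {g: true_positives[g] / positives[g] for g in positives}

# Toy predictions: (age band, true label, predicted label).
records = [
    ("<65", 1, 1), ("<65", 1, 1), ("<65", 1, 0), ("<65", 0, 0),
    (">=65", 1, 1), (">=65", 1, 0), (">=65", 1, 0), (">=65", 0, 1),
]
for group, sens in sensitivity_by_group(records).items():
    print(f"{group}: sensitivity={sens:.2f}")
```

In this toy data the overall sensitivity is 0.50, which hides the fact that the model finds two thirds of cases in the younger group and only one third in the older one.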

Hands-on

Lab A7: Reproduce & Critique

Subgroup analysis, calibration plot, and failure mode identification on your Week 3/4 models. Compare against published results.
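
The numbers behind a calibration plot can be computed without a plotting library: bin the predicted probabilities, then compare the mean prediction in each bin with the observed event rate. The sketch below uses equal-width bins and invented predictions; the helper name `calibration_table` is hypothetical, and scikit-learn's `calibration_curve` provides a similar calculation for real use.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Return (mean predicted, observed rate, count) for each non-empty equal-width bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    table = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            table.append((mean_pred, observed, len(b)))
    return table

# Toy predictions: the model is confident at both ends of the scale.
y_true = [0, 0, 0, 1, 1, 1, 0, 0]
y_prob = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]

for mean_pred, observed, count in calibration_table(y_true, y_prob):
    print(f"predicted={mean_pred:.2f}  observed={observed:.2f}  n={count}")
```

A well-calibrated model has predicted and observed values close to each other in every bin. Here the high-risk bin predicts 0.90 but only half of those patients have the event, a pattern that AUC alone would not reveal.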

Assignment

Evaluation report: subgroup results + calibration + failure mode + improvement suggestions

Recommended Reading

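Roberts et al., Common Pitfalls and Recommendations for Using Machine Learning to Detect and Prognosticate for COVID-19 Using Chest Radiographs and CT Scans (Nat Mach Intell 2021)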

Track B: Evaluate

Lecture

Advanced Failure Modes in Medical AI

  • Leakage, shortcut learning, subgroup disparity
  • Reproducibility: why you can't replicate paper results
  • p-hacking in ML: model/metric/dataset selection degrees of freedom

Hands-on

Workshop B7: Paper Critique #2 -- Find the Flaw

Critique 2 traditional + 2 CDSS/patient chatbot papers (Nature Medicine 2026, NEJM AI 2025). Find hidden flaws.

Assignment

Full critique of 2+ papers (including 1 CDSS/chatbot) + 3 major questions for authors

Recommended Reading

LLM Chatbot for Mental Health Treatment (NEJM AI 2025, RCT)

Track C: Deploy

Lecture

AI ROI -- Beyond the Vendor Slide Deck

  • ROI structure: cost avoidance vs revenue vs quality vs risk reduction
  • Hidden costs: integration, training, workflow redesign, ongoing monitoring
  • Exit strategy: when to shut down an AI tool

Hands-on

Decision Lab C7: ROI Calculator Workshop

Calculate direct/indirect ROI for a 6-month sepsis AI pilot. Run sensitivity analysis on false positive rate changes.
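
The sensitivity analysis in this lab can be prototyped in a few lines before building the full worksheet. The sketch below is a deliberately simplified model in which every dollar figure and alert volume is a made-up placeholder, and "false positive rate" is modeled as the share of alerts that turn out to be false; a real pilot would replace these with local cost data.

```python
def pilot_roi(alerts_per_month, false_alert_fraction, cost_per_false_alert,
              benefit_per_true_alert, fixed_cost, months=6):
    """Simple ROI = (benefit - cost) / cost over the pilot period."""
    true_alerts = alerts_per_month * (1 - false_alert_fraction) * months
    false_alerts = alerts_per_month * false_alert_fraction * months
    benefit = true_alerts * benefit_per_true_alert
    cost = fixed_cost + false_alerts * cost_per_false_alert
    return (benefit - cost) / cost

# Hypothetical inputs for a 6-month sepsis AI pilot (all placeholders).
assumptions = dict(alerts_per_month=200, cost_per_false_alert=50,
                   benefit_per_true_alert=400, fixed_cost=150_000)

# Sensitivity analysis: vary only the share of alerts that are false.
for fraction in (0.3, 0.5, 0.7):
    roi = pilot_roi(false_alert_fraction=fraction, **assumptions)
    print(f"false alerts={fraction:.0%}  ROI={roi:+.0%}")
```

Under these placeholder inputs the pilot moves from a positive return to a loss as the false alert share rises from 30% to 70%, which is the kind of break-even point a go/no-go recommendation should state explicitly.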

Assignment

ROI worksheet + sensitivity analysis + go/no-go recommendation

Recommended Reading

Kaiser Permanente AI ROI Framework

Week 08

Multi-Agent Systems & Hospital Automation

Track A: Build

Lecture

Multi-Agent Systems for Healthcare

  • Single agent vs multi-agent: when to decompose tasks
  • OpenClaw deep dive: agent definitions, memory model, context management
  • Medical multi-agent: literature review + analysis + report generation pipelines

Hands-on

Lab A8: OpenClaw Medical AI Pipeline Demo

Interact with a pre-built 4-agent pipeline: paper intake -> method extraction -> PubMed search -> structured review report.

Assignment

Design your own 3-agent medical workflow: text description + agent definitions + expected I/O

Recommended Reading

Anthropic, Long-Running Claude for Scientific Computing (2026)

Track B: Evaluate

Lecture

Multi-Agent Systems -- What Clinicians Need to Know

  • Why one AI isn't enough: specialized agents for evidence synthesis
  • OpenClaw: methodology agent + bias agent + clinical agent collaboration
  • Human-in-the-loop: what to automate vs what requires your eyes

Hands-on

Workshop B8: OpenClaw Evidence Synthesis Demo

Live demo of multi-agent paper review. Modify an agent prompt and observe output changes. Can this replace your journal club?

Assignment

Describe a multi-agent clinical workflow you want + why multiple agents + where human review is mandatory

Track C: Deploy

Lecture

Multi-Agent AI in Hospital Operations

  • Agentic workflow scenarios: QA routing, prior auth, clinical trial matching, bed management
  • CDSS agentic pipeline: order intake -> RAG search -> LLM reasoning -> severity routing
  • Risk & governance: what can be fully automated vs human-approved
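
The four stages of the CDSS pipeline in the second bullet (order intake, RAG search, LLM reasoning, severity routing) can be sketched as plain functions to show where the human-approval boundary sits. Everything below is a hypothetical stand-in: the dictionary replaces a real retrieval index, `reason` replaces an LLM call with a fixed rule, and the routing actions are illustrative rather than any product's behavior.

```python
# Toy stand-in for a retrieval index; entries are illustrative, not a drug database.
KNOWLEDGE_BASE = {
    frozenset({"warfarin", "ibuprofen"}): ("major", "increased bleeding risk"),
    frozenset({"lisinopril", "spironolactone"}): ("moderate", "hyperkalemia risk"),
}
SEVERITY_RANK = {"none": 0, "moderate": 1, "major": 2}

def order_intake(raw_order):
    """Normalize an incoming order into a lowercase drug name."""
    return raw_order.strip().lower()

def rag_search(new_drug, active_meds):
    """Retrieve entries that pair the new drug with an active medication."""
    hits = []
    for med in active_meds:
        entry = KNOWLEDGE_BASE.get(frozenset({new_drug, med.lower()}))
        if entry:
            hits.append((med.lower(), *entry))
    return hits

def reason(hits):
    """Placeholder for the LLM reasoning step: keep the most severe retrieved finding."""
    if not hits:
        return "none", "no interaction found in retrieved evidence"
    med, severity, note = max(hits, key=lambda h: SEVERITY_RANK[h[1]])
    return severity, f"{med}: {note}"

def route(severity):
    """Severity routing: only the lowest-risk result proceeds without a human."""
    return {
        "major": "hold order for pharmacist review",
        "moderate": "alert prescriber",
        "none": "log and proceed",
    }[severity]

def run_pipeline(raw_order, active_meds):
    drug = order_intake(raw_order)
    severity, rationale = reason(rag_search(drug, active_meds))
    return route(severity), rationale

print(run_pipeline(" Ibuprofen ", ["Warfarin", "Metformin"]))
```

Keeping the routing table as an explicit, reviewable mapping is one way to make the governance question in the last bullet concrete: the committee approves the table, not the model's free-text output.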

Hands-on

Decision Lab C8: OpenClaw Hospital Automation Demo

Live demo of QA event routing pipeline. Design an agentic workflow for your hospital process. Estimate FTE replacement ROI.

Assignment

Agentic workflow design + cost-benefit sketch

Recommended Reading

Anthropic, Long-Running Claude for Scientific Computing (2026)

Week 09

Regulation, Ethics & Safe Deployment

Track A: Build

Lecture

Building Within Boundaries

  • FDA SaMD classification: 510(k) / De Novo / PMA pathways
  • EU AI Act high-risk classification; WHO 2025 guidance on health LLMs
  • Liability, model cards, datasheets for datasets, algorithmic impact assessments

Hands-on

Lab A9: Compliance Constraint Checklist

Apply compliance checklist to your Week 6 Claude Code project: de-identification, intended use, FDA level, monitoring plan.

Assignment

Completed compliance checklist + revised project scope

Recommended Reading

FDA SaMD Framework + WHO Guidance on Health LLMs (2025)

Track B: Evaluate

Lecture

Using AI Safely in Clinical Practice

  • FDA SaMD framework: what level is your AI tool?
  • WHO 6 principles for health LLMs; EU AI Act implications
  • Liability: if AI is wrong, who is responsible -- you or the vendor?

Hands-on

Workshop B9: Draft a Safe-Use Protocol

Write a safe-use protocol for an AI clinical note summarizer: intended use, human review nodes, incident reporting, exit criteria.

Assignment

Completed safe-use protocol using provided template

Recommended Reading

WHO Guidance on Ethics & Governance of LLMs in Health (2025)

Track C: Deploy

Lecture

Building an AI Governance Program

  • Governance 4 pillars: policy, process, people, technology
  • Committee structure: AI governance board, clinical AI review, IT security
  • Patient chatbot governance: consent, scope limits, emergency escalation, adverse event reporting

Hands-on

Decision Lab C9: Governance Playbook Workshop

Build a full AI governance playbook: RACI matrix, risk classification, incident response, patient chatbot governance checklist.

Assignment

Governance playbook + RACI matrix + patient chatbot governance checklist

Recommended Reading

FDA SaMD Framework + EU AI Act + WHO LLM Guidance (2025)

Week 10

Capstone: Demo Day

Track A: Build

Lecture

Capstone Presentations

  • 5-min demo: data -> model -> evaluation pipeline + Claude Code / OpenClaw process
  • Technical memo (2-3 pages): problem, data, methods, results, limitations, compliance
  • Peer review: evaluate another student's project using Track B frameworks

Hands-on

Lab A10: Demo Day + Peer Review

Present your end-to-end medical AI project. Receive structured peer feedback across technical correctness, clinical reasoning, and agent tool usage.

Assignment

Final deliverables: notebook + technical memo + demo + peer review

Track B: Evaluate

Lecture

Capstone Presentations

  • AI Tool Evaluation Memo (3-4 pages): evidence quality, clinical usability, risk, recommendation
  • Trust / Use-with-caution / Reject recommendation + safe-use protocol
  • 5-min presentation to simulated hospital committee

Hands-on

Workshop B10: Demo Day + Peer Review

Present your AI tool evaluation to a simulated hospital committee. Receive peer feedback on evidence assessment and clinical judgment.

Assignment

Final deliverables: evaluation memo + peer review + 5-min presentation

Track C: Deploy

Lecture

Executive AI Strategy Simulation

  • Scenario: 500-bed hospital, $2M AI budget, 12 months to deploy 2 use cases
  • Executive strategy memo (3-5 pages): use cases, vendor eval, roadmap, governance, ROI
  • 10-min board presentation + peer challenge from other teams

Hands-on

Decision Lab C10: Board Presentation + Peer Challenge

Present your AI strategy to a simulated board. Defend: why these use cases? What if the first one fails? What are competitors doing?

Assignment

Final deliverables: executive strategy memo + board presentation + peer challenge

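Anchor Cases

All three tracks return to these clinical anchors from different angles, building a shared language across disciplines.

CXR / Radiology AI

From CNN architectures to reader studies, workflow integration, and procurement evaluation.

EHR / Clinical Note Summarization

From transformer embeddings to hallucination risk, documentation support, and vendor evaluation.

Sepsis / Deterioration Prediction

From risk-score modeling to threshold setting, clinical utility, and deployment monitoring.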

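Assessment

We do not test who can best recite AI terminology. We assess who can define the problem, match the model to the task, evaluate the evidence, and judge clinical safety.

  • 20%: Class participation and case reflections
  • 20%: Weekly assignments (tiered by track)
  • 25%: Paper critique or tool evaluation
  • 35%: Capstone project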

Ready to choose your track?

The Spring 2026 cohort is now forming. All three tracks are open; choose the one that fits your background.