About BetterHealthBench
BetterHealthBench is an independent evaluation platform for healthcare AI. We run multi-turn clinical scenarios under fixed scaffolding, score model responses with both deterministic metrics and an LLM jury, and track performance over time so that health systems and procurement teams can make evidence-based decisions.
Why we exist
Most healthcare AI benchmarks measure single-turn factual recall on closed-book exams. That's not how clinical AI is used in practice: patients don't hand over structured vignettes, and safety failures appear in the dialogue, not the final answer. We built BetterHealthBench to close that gap. Adversarial safety testing, longitudinal drift tracking, and methodology transparency are first-class concerns, not footnotes.
Independence
We are not affiliated with, endorsed by, or funded by any frontier model provider. Benchmark results are not influenced by commercial relationships with evaluated vendors. See our methodology for how we keep evaluations fair and reproducible.
Who we are
BetterHealthBench was founded by a practicing clinician with deep expertise in evaluation methodology. We work with academic collaborators, health-system AI governance committees, and clinician reviewers.
Contact
For partnerships, benchmark submissions, or press inquiries, see our contact page.