摘要
Faced with challenging cases, doctors are increasingly seeking diagnostic advice from large language models (LLMs). This study aims to compare the ability of LLMs and human physicians to diagnose challenging cases. An offline dataset of 67 challenging cases with primary gastrointestinal symptoms was used to solicit possible diagnoses from seven LLMs and 22 gastroenterologists. The diagnoses by Claude 3.5 Sonnet covered the highest proportion (95% confidence interval [CI]) of instructive diagnoses (76.1%, [70.6%–80.9%]), significantly surpassing all the gastroenterologists (p < 0.05 for all). Claude 3.5 Sonnet achieved a significantly higher coverage rate (95% CI) than that of the gastroenterologists using search engines or other traditional resource (76.1% [70.6%–80.9%] vs. 45.5% [40.7%-50.4%], p < 0.001). The study highlights that advanced LLMs may assist gastroenterologists with instructive, time-saving, and cost-effective diagnostic scopes in challenging cases.
| 源语言 | 英语 |
|---|---|
| 文章编号 | 85 |
| 期刊 | npj Digital Medicine |
| 卷 | 8 |
| 期 | 1 |
| DOI | |
| 出版状态 | 已出版 - 12月 2025 |
| 已对外发布 | 是 |
学术指纹
探究 'Multiple large language models versus experienced physicians in diagnosing challenging cases with gastrointestinal symptoms' 的科研主题。它们共同构成独一无二的指纹。引用此
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver