
Multiple large language models versus experienced physicians in diagnosing challenging cases with gastrointestinal symptoms

  • Xintian Yang
  • Tongxin Li
  • Han Wang
  • Rongchun Zhang
  • Zhi Ni
  • Na Liu
  • Huihong Zhai
  • Jianghai Zhao
  • Fandong Meng
  • Zhongyin Zhou
  • Shanhong Tang
  • Limei Wang
  • Xiangping Wang
  • Hui Luo
  • Gui Ren
  • Linhui Zhang
  • Xiaoyu Kang
  • Jun Wang
  • Ning Bo
  • Xiaoning Yang
  • Weijie Xue
  • Xiaoyin Zhang
  • Ning Chen
  • Rui Guo
  • Baiwen Li
  • Yajun Li
  • Yaling Liu
  • Tiantian Zhang
  • Shuhui Liang
  • Yong Lv
  • Yongzhan Nie
  • Daiming Fan
  • Lina Zhao
  • Yanglin Pan
  • Xijing Hospital
  • Huazhong University of Science and Technology
  • Fujian Medical University
  • Hainan Medical University
  • Capital Medical University
  • Henan University
  • Renmin Hospital of Wuhan University
  • The General Hospital of Western Theater Command
  • Shaanxi Second People’s Hospital
  • Air Force Medical University
  • The Second Affiliated Hospital of Chongqing Medical University
  • Zhongshan Hospital Affiliated to Xiamen University
  • Kumamoto University
  • Shenzhen Third People's Hospital
  • Peking University
  • Shanghai Jiao Tong University
  • Ningxia Medical University

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

Faced with challenging cases, doctors are increasingly seeking diagnostic advice from large language models (LLMs). This study aims to compare the ability of LLMs and human physicians to diagnose challenging cases. An offline dataset of 67 challenging cases with primary gastrointestinal symptoms was used to solicit possible diagnoses from seven LLMs and 22 gastroenterologists. The diagnoses by Claude 3.5 Sonnet covered the highest proportion (95% confidence interval [CI]) of instructive diagnoses (76.1% [70.6%–80.9%]), significantly surpassing all the gastroenterologists (p < 0.05 for all). Claude 3.5 Sonnet achieved a significantly higher coverage rate (95% CI) than that of the gastroenterologists using search engines or other traditional resources (76.1% [70.6%–80.9%] vs. 45.5% [40.7%–50.4%], p < 0.001). The study highlights that advanced LLMs may assist gastroenterologists with instructive, time-saving, and cost-effective diagnostic scopes in challenging cases.

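Original language: English
Article number: 85
Journal: npj Digital Medicine
Volume: 8
Issue number: 1
DOI
Publication status: Published - Dec 2025
Externally published: Yes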
