P. G. Lacaita, Malik Galijašević, Michael Swoboda, Leonhard Gruber, Yannick Scharll, F. Barbieri, Gerlig Widmann, G. Feuchtner
1 May 2025

The Accuracy of ChatGPT-4o in Interpreting Chest and Abdominal X-Ray Images

Background/Objectives: Large language models (LLMs), such as ChatGPT, have emerged as potential clinical support tools to enhance precision in personalized patient care, but their reliability in radiological image interpretation remains uncertain. The primary aim of our study was to evaluate the diagnostic accuracy of ChatGPT-4o in interpreting chest X-rays (CXRs) and abdominal X-rays (AXRs) by comparing its performance to expert radiology findings; the secondary aims were to assess diagnostic confidence and patient safety. Methods: A total of 500 X-rays, including 257 CXRs (51.4%) and 243 AXRs (48.6%), were analyzed. Diagnoses made by ChatGPT-4o were compared to expert interpretations. Confidence scores (1–4) were assigned, and responses were evaluated for patient safety. Results: ChatGPT-4o correctly identified 345 of 500 (69%) pathologies (95% CI: 64.81–72.9). For AXRs, 175 of 243 (72.02%) pathologies were correctly diagnosed (95% CI: 66.06–77.28), while for CXRs, 170 of 257 (66.15%) were correctly diagnosed (95% CI: 60.16–71.66). Among CXRs, the highest detection rates were observed for pulmonary edema, tumor, pneumonia, pleural effusion, cardiomegaly, and emphysema, and lower rates were observed for pneumothorax, rib fractures, and enlarged mediastinum. AXR performance was highest for intestinal obstruction and foreign bodies, and weaker for pneumoperitoneum, renal calculi, and diverticulitis. Confidence scores were higher for AXRs (mean 3.45 ± 1.1) than for CXRs (mean 2.48 ± 1.45). All responses (100%) were considered safe for the patient. Interobserver agreement was high (kappa = 0.920), and reliability (second prompt) was moderate (kappa = 0.750). Conclusions: ChatGPT-4o demonstrated moderate accuracy in the interpretation of X-rays, with higher accuracy for AXRs than for CXRs. Improvements are required before it can be used as an efficient clinical support tool.
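
The abstract does not state how the 95% confidence intervals were computed. As a minimal sketch, assuming a Wilson score interval for a binomial proportion (an assumption, not a method confirmed by the authors), the reported accuracy figures can be recomputed from the counts given above. The function name `wilson_interval` and the `counts` dictionary are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, total: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (assumed method)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Correct diagnoses and sample sizes as reported in the abstract.
counts = {"All X-rays": (345, 500), "AXR": (175, 243), "CXR": (170, 257)}

for label, (correct, n) in counts.items():
    low, high = wilson_interval(correct, n)
    print(f"{label}: {correct}/{n} = {correct / n:.2%}, 95% CI {low:.2%} to {high:.2%}")
```

Under this assumption, the computed bounds fall close to the intervals reported in the abstract, which suggests (but does not prove) that a Wilson-type interval was used.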

