Hanbitsa Paper
Sun Yeop Lee1,11, Sangwoo Ha1,11, Min Gyeong Jeon1, Hao Li1, Hyunju Choi1, Hwa Pyung Kim1, Ye Ra Choi2,3, Hoseok I4,5, Yeon Joo Jeong6, Yoon Ha Park7, Hyemin Ahn8, Sang Hyup Hong8, Hyun Jung Koo8, Choong Wook Lee8, Min Jae Kim9, Yeon Joo Kim10, Kyung Won Kim8 and Jong Mun Choi1
1Department of Medical Artificial Intelligence, Deepnoid, Inc., Seoul, Republic of Korea.
2Department of Radiology, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, Republic of Korea.
3Department of Radiology, Seoul National University College of Medicine, Seoul, Republic of Korea.
4Department of Thoracic and Cardiovascular Surgery, Pusan National University School of Medicine, Busan, Republic of Korea.
5Convergence Medical Institute of Technology, Biomedical Research Institute, Pusan National University Hospital, Busan, Republic of Korea.
6Department of Radiology and Biomedical Research Institute, Pusan National University Hospital, Busan, Republic of Korea.
7Department of Internal Medicine, Jawol Health Center, Incheon, Republic of Korea.
8Department of Radiology and Research Institute of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.
9Department of Infectious Disease, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.
10Department of Respiratory Allergy Medicine, Nowon Eulji Medical Center, Seoul, Republic of Korea.
11These authors contributed equally: Sun Yeop Lee, Sangwoo Ha.
Corresponding author: Jong Mun Choi.
Abstract
While many deep-learning-based computer-aided detection systems (CAD) have been developed and commercialized for abnormality detection in chest radiographs (CXR), their ability to localize a target abnormality is rarely reported. Localization accuracy is important in terms of model interpretability, which is crucial in clinical settings. Moreover, diagnostic performances are likely to vary depending on thresholds which define an accurate localization. In a multi-center, stand-alone clinical trial using temporal and external validation datasets of 1,050 CXRs, we evaluated localization accuracy, localization-adjusted discrimination, and calibration of a commercially available deep-learning-based CAD for detecting consolidation and pneumothorax. The CAD achieved image-level AUROC (95% CI) of 0.960 (0.945, 0.975), sensitivity of 0.933 (0.899, 0.959), specificity of 0.948 (0.930, 0.963), dice of 0.691 (0.664, 0.718), moderate calibration for consolidation, and image-level AUROC of 0.978 (0.965, 0.991), sensitivity of 0.956 (0.923, 0.978), specificity of 0.996 (0.989, 0.999), dice of 0.798 (0.770, 0.826), moderate calibration for pneumothorax. Diagnostic performances varied substantially when localization accuracy was accounted for but remained high at the minimum threshold of clinical relevance. In a separate trial for diagnostic impact using 461 CXRs, the causal effect of the CAD assistance on clinicians' diagnostic performances was estimated. After adjusting for age, sex, dataset, and abnormality type, the CAD improved clinicians' diagnostic performances on average (OR [95% CI] = 1.73 [1.30, 2.32]; p < 0.001), although the effects varied substantially by clinical backgrounds. The CAD was found to have high stand-alone diagnostic performances and may beneficially impact clinicians' diagnostic performances when used in clinical settings.
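To make the idea of localization-adjusted performance concrete, the following is a minimal sketch of how a Dice overlap score between a predicted and a reference lesion mask could be computed, and how a detection might be counted as correct only when that overlap reaches a chosen threshold. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `dice` and `localization_adjusted_hit`, the binary-mask representation, and the example thresholds are all hypothetical and do not come from the paper.

```python
import numpy as np


def dice(pred_mask, true_mask):
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    total = pred.sum() + true.sum()
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is an
        # assumption of this sketch, not a convention taken from the paper.
        return 1.0
    return float(2.0 * np.logical_and(pred, true).sum() / total)


def localization_adjusted_hit(pred_mask, true_mask, min_dice):
    """Count a detection as correct only if its Dice overlap reaches min_dice."""
    return bool(dice(pred_mask, true_mask) >= min_dice)


# Toy 4x4 example: two 2x2 regions that overlap in one column (2 pixels).
true_mask = np.zeros((4, 4), dtype=bool)
true_mask[0:2, 0:2] = True
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[0:2, 1:3] = True

score = dice(pred_mask, true_mask)  # 2 * 2 / (4 + 4) = 0.5
lenient = localization_adjusted_hit(pred_mask, true_mask, min_dice=0.2)
strict = localization_adjusted_hit(pred_mask, true_mask, min_dice=0.7)
```

In this toy case the same prediction counts as a hit under the lenient threshold and as a miss under the strict one, which is the mechanism by which sensitivity can change with the threshold that defines an accurate localization.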