Deep Learning Classification of Active Tuberculosis Using Chest X-Rays: Efficacy of Transfer Learning and Generalization Performance of Cross-Population Datasets
Abstract Chest X-ray based diagnosis of active tuberculosis (TB) is one of the oldest and most ubiquitous tests in medical practice. Artificial intelligence (AI) based automated detection of abnormalities in chest radiographs is crucial to the radiology workflow. Most deep convolutional neural networks (DCNNs) for diagnosing TB are built by transfer learning from natural images and use the same dataset to evaluate model performance and diagnostic accuracy. However, dataset shift, a known issue for predictive models in AI, remains unexplored. In this work, we fine-tuned, validated, and tested two benchmark architectures using transfer learning and measured their diagnostic accuracy on cross-population datasets. We achieved a classification accuracy of 100% and an area under the receiver operating characteristic curve (AUC) of 1.000 [1.000 – 1.000] (with a sensitivity of 0.985 [0.971 – 1.000] and a specificity of 0.986 [0.971 – 1.000]) on the intramural test set, but performance dropped significantly on the extramural test sets. Accuracy on the various extramural test sets ranged from 50% to 70%, and AUC ranged from 0.527 to 0.865 (sensitivity and specificity ranged from 0.394 to 0.995 and from 0.443 to 0.864, respectively). The diagnostic performance observed on the intramural test set shows that a DCNN can accurately classify active TB and normal chest radiographs; however, the results on the external test sets show that a DCNN trained on a population-specific dataset is less likely to generalize well to other populations.
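As an illustration of the metrics reported above, the following is a minimal sketch of how sensitivity, specificity, and AUC could be computed for one test set from binary labels (1 = active TB, 0 = normal) and model scores. The function names, the 0.5 decision threshold, and the toy data are assumptions for illustration only; they are not taken from the study, and the sketch does not reproduce the bracketed intervals.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a decision threshold.

    The 0.5 default is an assumed threshold, not one from the study.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): the same scoring applied to two test sets.
intramural = ([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
extramural = ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])

for name, (y, s) in (("intramural", intramural), ("extramural", extramural)):
    sens, spec = sensitivity_specificity(y, s)
    print(f"{name}: AUC={auc(y, s):.3f} sens={sens:.3f} spec={spec:.3f}")
```

Running the same evaluation on an intramural and an extramural test set, as in the loop above, is what exposes a generalization gap: the model and threshold stay fixed, and only the population the images come from changes.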