A Novel Integrated Type 2 Diabetes Prediction Model for Indian Population using Data Mining Techniques
Late diagnosis and undiagnosed type 2 diabetes are two major concerns for India, which is projected to become the diabetes capital of the world soon. Several diabetes risk score (DRS) tools have been proposed and deployed to identify people at high risk. These DRS tools have been developed using the multiple logistic regression model. However, this model is both imperfect and subject to misuse. Another major issue with the DRS tools developed for the Indian population is that they are based on very limited urban samples that do not represent the population of India. The objective of the current research is to develop a classification model for type 2 diabetes prediction. In addition, we discuss the construction of a novel integrated model for type 2 diabetes risk prediction, consisting of the aggregate classification model and the Indian weighted diabetes risk score model. The datasets used to develop and validate the model are obtained from the Annual Health Survey and comprise nearly 0.7 million and nearly 75 thousand adult participants, respectively, from around 400 districts of India. The proposed integrated diabetes risk prediction model predicts diabetes with 69.89% sensitivity and 56.58% specificity. The positive predictive value of the proposed integrated model is 15.88%, which is a significant improvement given that the prevalence of diabetes in the study population is only 3.68%. In developing countries such as India, where undiagnosed diabetes and limited financial resources are significant concerns, the proposed integrated model for diabetes risk prediction can serve as a cheaper mass-screening tool that can save up to 30% of the total screening cost.
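For readers less familiar with the screening metrics quoted above, the following minimal sketch shows how sensitivity, specificity, positive predictive value, and prevalence are conventionally derived from a confusion matrix. The class and function names (`ConfusionCounts`, `screening_metrics`) are hypothetical, and the counts are made-up round numbers chosen only for illustration; they are not taken from the Annual Health Survey data or from the study's results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts of a binary screening test (hypothetical container)."""
    tp: int  # people with diabetes flagged as high risk
    fn: int  # people with diabetes missed by the tool
    fp: int  # people without diabetes flagged as high risk
    tn: int  # people without diabetes correctly not flagged


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return the standard screening metrics as fractions in [0, 1]."""
    with_disease = c.tp + c.fn
    without_disease = c.fp + c.tn
    flagged = c.tp + c.fp
    total = with_disease + without_disease
    return {
        # Share of true cases that the tool flags.
        "sensitivity": c.tp / with_disease,
        # Share of non-cases that the tool correctly leaves unflagged.
        "specificity": c.tn / without_disease,
        # Share of flagged people who actually have the disease.
        "ppv": c.tp / flagged,
        # Share of the whole screened population with the disease.
        "prevalence": with_disease / total,
    }


# Illustrative counts only (assumption): 1,000 screened people, 100 with diabetes.
counts = ConfusionCounts(tp=70, fn=30, fp=390, tn=510)
metrics = screening_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Comparing the positive predictive value with the prevalence indicates how much more likely a flagged person is to have diabetes than a randomly chosen member of the screened population, which is the sense in which a screening tool can reduce the number of confirmatory tests required.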