Identification of Significant Climatic Risk Factors and Machine Learning Models in Dengue Outbreak Prediction
Abstract
Background: Dengue fever is a widespread viral disease, one of the world’s major vector-borne infections, and a serious threat to human health. According to the World Health Organization (WHO), the incidence of dengue has grown dramatically worldwide in recent decades, and the WHO currently estimates 50–100 million dengue infections worldwide each year. To date, there is no established vaccine or specific treatment to stop or prevent dengue fever, which makes dengue outbreak prediction important. The main open issue in dengue outbreak prediction is accuracy, and only a limited number of studies have analyzed in depth the role of climatic factors in outbreak prediction.
Methods: In this study, the climatic factors that contribute most significantly to dengue outbreaks were identified and used as input parameters for machine learning models. The models were trained and evaluated on four years of data from Malaysia, covering January 2010 to December 2013.
Results: This work makes two main contributions. First, a new risk factor, called the TempeRain Factor (TRF), was determined and used as an input parameter for the dengue outbreak prediction model. Second, the TRF was shown to have a strong impact on dengue outbreaks. Experimental results showed that a Support Vector Machine (SVM) using the newly identified meteorological risk factor achieved a higher accuracy of 98.09% and reduced the root mean square error (RMSE) to 0.098 for dengue outbreak prediction.
Conclusions: This research explored the factors used in dengue outbreak prediction systems. Its main contribution is the identification of new significant factors that contribute to dengue outbreak prediction. In the evaluation, including these factors yielded a significant improvement in the accuracy of the machine learning model for dengue outbreak prediction.
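The two evaluation measures quoted in the Results, accuracy and root mean square error, can be illustrated with a minimal sketch. The abstract does not state how either was computed, so this assumes binary outbreak labels (1 = outbreak, 0 = no outbreak) and model scores in the range 0 to 1. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of periods whose outbreak label was predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_score):
    """Root mean square error between observed labels and model scores."""
    if len(y_true) != len(y_score) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((t - s) ** 2 for t, s in zip(y_true, y_score))
    return math.sqrt(squared_error / len(y_true))


# Hypothetical toy data, NOT from the study: eight observation periods.
observed = [0, 0, 1, 1, 0, 1, 0, 0]                    # 1 = outbreak occurred
predicted = [0, 0, 1, 0, 0, 1, 0, 0]                   # hard labels from a classifier
scores = [0.1, 0.2, 0.9, 0.4, 0.1, 0.8, 0.3, 0.0]      # assumed model scores

print(f"accuracy = {accuracy(observed, predicted):.3f}")
print(f"RMSE     = {rmse(observed, scores):.3f}")
```

Accuracy counts only whether each hard label is correct, whereas RMSE also penalizes how far each score lies from the observed label, which is why the two measures are reported separately.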