PERFORMANCE OF MACHINE LEARNING ALGORITHMS FOR MAPPING AND FORECASTING OF FLASH FLOOD SUSCEPTIBILITY IN TETOUAN, MOROCCO

Author(s):  
E. M. Sellami ◽  
M. Maanan ◽  
H. Rhinane

Abstract. Since the industrial revolution, the world has experienced a major change in its climate, which has caused many imbalances such as flash floods (FF). The aim of this study is to propose a new approach for the detection and forecasting of flash flood susceptibility in the city of Tetouan, Morocco. To this end, support vector machine (SVM), logistic regression (LR), random forest (RF), Naïve Bayes (NB) and artificial neural network (ANN) models are used, based on 1101 points (680 flood points and 421 non-flood points) and 9 flash-flood predictors (elevation, slope, aspect, LU/LC, Stream Power Index, plan curvature, profile curvature, Topographic Position Index and Topographic Wetness Index) that were extracted from the DEM (10 m resolution) and satellite imagery (Sentinel 2B) of the study area. The models were trained on 70% and tested on 30% of this dataset, and they were evaluated using several metrics such as the Receiver Operating Characteristic (ROC) curve, precision, recall, score and kappa index. The results demonstrated that RF (AUC = 0.99, Accuracy = 96%, Kappa statistics = 0.92) has the highest performance, followed by ANN (AUC = 0.98, Accuracy = 95%, Kappa statistics = 0.89) and SVM (AUC = 0.96, Accuracy = 92%, Kappa statistics = 0.80). The proposed approach is an effective tool for forecasting and predicting FF that can help reduce the severity of this disaster.
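
The accuracy and kappa statistics quoted above are both derived from a confusion matrix. As a minimal sketch (the function name is illustrative and the counts are hypothetical, not taken from the study), Cohen's kappa compares the observed agreement with the agreement expected by chance:

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a binary confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement, i.e. overall accuracy
    # Chance agreement: product of the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (not the paper's data).
acc, kappa = accuracy_and_kappa(tp=40, fp=10, fn=10, tn=40)
print(f"accuracy={acc:.2f}, kappa={kappa:.2f}")
```

A kappa well below the accuracy, as in this toy case, indicates that part of the raw agreement would be expected by chance alone.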

2020 ◽  
Vol 12 (17) ◽  
pp. 2688 ◽  
Author(s):  
Viet-Ha Nhu ◽  
Phuong-Thao Thi Ngo ◽  
Tien Dat Pham ◽  
Jie Dou ◽  
Xuan Song ◽  
...  

Flash floods are among the most dangerous natural phenomena because of their high magnitude and sudden occurrence, resulting in severe damage to people and property. Our work aims to propose a state-of-the-art model for flash flood susceptibility mapping using the decision tree random subspace ensemble optimized by hybrid firefly–particle swarm optimization (HFPS), namely the HFPS-RSTree model. In this work, we used data from a flood inventory map consisting of 1866 polygons derived from Sentinel-1 C-band synthetic aperture radar (SAR) data and a field survey conducted in the northwest mountainous area of the Van Ban district, Lao Cai Province in Vietnam. A total of eleven flooding conditioning factors (soil type, geology, rainfall, river density, elevation, slope, aspect, topographic wetness index (TWI), normalized difference vegetation index (NDVI), plan curvature, and profile curvature) were used as explanatory variables. These indicators were compiled from a geological and mineral resources map, a soil type map, a topographic map, the ALOS PALSAR 30 m DEM, and Landsat-8 imagery. The HFPS-RSTree model was trained and verified using the inventory map and the eleven conditioning variables and then compared with four machine learning algorithms, i.e., the support vector machine (SVM), random forest (RF), C4.5 decision tree (C4.5 DT), and logistic model tree (LMT) models. We employed a range of standard statistical metrics to assess the predictive performance of the proposed model. The results show that the HFPS-RSTree model had the best predictive performance and outperformed the other benchmarks in predicting flash floods, reaching an overall accuracy of over 90%. It can be concluded that the proposed approach provides new insights into flash flood prediction in mountainous regions.
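
The abstract names the topographic wetness index (TWI) without defining it. A minimal sketch, assuming the commonly used definition TWI = ln(a / tan β), where a is the specific catchment area and β is the local slope; the function name, the input units, and the small slope floor for flat cells are assumptions made for illustration, not details from the paper:

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_degrees, min_slope_degrees=0.1):
    """TWI = ln(a / tan(beta)), with an assumed floor on the slope."""
    # Clamp the slope so that flat cells do not cause a division by zero.
    slope = math.radians(max(slope_degrees, min_slope_degrees))
    return math.log(specific_catchment_area / math.tan(slope))


# A gentle slope with a large contributing area scores higher (wetter)
# than a steep slope with a small contributing area.
print(topographic_wetness_index(500.0, 2.0))
print(topographic_wetness_index(20.0, 30.0))
```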


Sensors ◽  
2019 ◽  
Vol 19 (22) ◽  
pp. 4893 ◽  
Author(s):  
Hejar Shahabi ◽  
Ben Jarihani ◽  
Sepideh Tavakkoli Piralilou ◽  
David Chittleborough ◽  
Mohammadtaghi Avand ◽  
...  

Gully erosion is a dominant source of sediment and particulates to the Great Barrier Reef (GBR) World Heritage area. We selected the Bowen catchment, a tributary of the Burdekin Basin, as our area of study; the region is associated with a high density of gully networks. We aimed to use a semi-automated object-based gully networks detection process using a combination of multi-source and multi-scale remote sensing and ground-based data. An advanced approach was employed by integrating geographic object-based image analysis (GEOBIA) with current machine learning (ML) models. These included artificial neural networks (ANN), support vector machines (SVM), and random forests (RF), and an ensemble ML model of stacking to deal with the spatial scaling problem in gully networks detection. Spectral indices such as the normalized difference vegetation index (NDVI) and topographic conditioning factors, such as elevation, slope, aspect, topographic wetness index (TWI), slope length (SL), and curvature, were generated from Sentinel 2A images and the ALOS 12-m digital elevation model (DEM), respectively. For image segmentation, the ESP2 tool was used to obtain three optimal scale factors. On using object pureness index (OPI), object matching index (OMI), and object fitness index (OFI), the accuracy of each scale in image segmentation was evaluated. The scale parameter of 45 with OFI of 0.94, which is a combination of OPI and OMI indices, proved to be the optimal scale parameter for image segmentation. Furthermore, segmented objects based on scale 45 were overlaid with 70% and 30% of a prepared gully inventory map to select the ML models’ training and testing objects, respectively. The quantitative accuracy assessment methods of Precision, Recall, and an F1 measure were used to evaluate the model’s performance. Integration of GEOBIA with the stacking model using a scale of 45 resulted in the highest accuracy in detection of gully networks with an F1 measure value of 0.89. 
Here, we conclude that the adoption of optimal scale object definition in the GEOBIA and application of the ensemble stacking of ML models resulted in higher accuracy in the detection of gully networks.
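
The Precision, Recall, and F1 measures used above can be computed directly from counts of correctly and incorrectly detected objects. A minimal sketch, with hypothetical counts that are not taken from the study and an illustrative function name:

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1); assumes tp + fp > 0 and tp + fn > 0."""
    precision = tp / (tp + fp)  # share of detected objects that are real
    recall = tp / (tp + fn)     # share of real objects that were detected
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts for illustration only.
p, r, f1 = precision_recall_f1(tp=6, fp=2, fn=4)
print(f"precision={p:.2f}, recall={r:.2f}, f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays low when either precision or recall is low, which makes it a stricter summary than their arithmetic average.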


Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3451 ◽  
Author(s):  
Usman Salihu Lay ◽  
Biswajeet Pradhan ◽  
Zainuddin Bin Md Yusoff ◽  
Ahmad Fikri Bin Abdallah ◽  
Jagannath Aryal ◽  
...  

Cameron Highland is a popular tourist hub in the mountainous area of Peninsular Malaysia. Most communities in this area suffer frequent debris flows, especially during monsoon seasons. Despite the loss of lives and property recorded annually from debris flows, most studies in the region concentrate on landslide and flood susceptibility. In this study, debris-flow susceptibility prediction was carried out using two data mining techniques: Multivariate Adaptive Regression Splines (MARS) and Support Vector Regression (SVR) models. The existing inventory of debris-flow events (640 points) was split into 70% (448) for training and 30% (192) for validation. Twelve conditioning factors, namely elevation, plan curvature, slope angle, total curvature, slope aspect, Stream Transport Index (STI), profile curvature, roughness index, Stream Catchment Area (SCA), Stream Power Index (SPI), Topographic Wetness Index (TWI) and Topographic Position Index (TPI), were derived from a Light Detection and Ranging (LiDAR)-derived Digital Elevation Model (DEM). Multi-collinearity was checked using Information Factor, Cramer’s V, and Gini Index to identify the relative importance of the conditioning factors. The susceptibility models were produced and categorized into five classes: not susceptible, low, moderate, high and very high. Model performance was evaluated using success and prediction rates, where the area under the curve (AUC) showed a higher performance of MARS (93% and 83%) over SVR (76% and 72%). The results of this study will be important for contingency hazard and risk management plans to reduce the loss of lives and property in the area.
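
The 70%/30% partition of the 640 inventory points can be reproduced with a seeded random shuffle. This is only a sketch: the abstract does not say how the points were assigned, so the random assignment, the seed, and the function name are assumptions.

```python
import random


def split_70_30(items, seed=42):
    """Shuffle a copy of the items and split it 70/30."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding of 0.7 * n.
    n_train = len(shuffled) * 7 // 10
    return shuffled[:n_train], shuffled[n_train:]


point_ids = range(640)  # stand-in IDs for the 640 inventory points
train, validation = split_70_30(point_ids)
print(len(train), len(validation))  # 448 192
```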


2020 ◽  
Vol 12 (21) ◽  
pp. 3568
Author(s):  
Shahab S. Band ◽  
Saeid Janizadeh ◽  
Subodh Chandra Pal ◽  
Asish Saha ◽  
Rabin Chakrabortty ◽  
...  

Flash flooding is considered one of the most dynamic natural disasters for which measures need to be taken to minimize economic damages, adverse effects, and consequences by mapping flood susceptibility. Identifying areas prone to flash flooding is a crucial step in flash flood hazard management. In the present study, the Kalvan watershed in Markazi Province, Iran, was chosen to evaluate flash flood susceptibility modeling. To detect flash flood-prone zones in this study area, five machine learning (ML) algorithms were tested: boosted regression tree (BRT), random forest (RF), parallel random forest (PRF), regularized random forest (RRF), and extremely randomized trees (ERT). Fifteen climatic and geo-environmental variables were used as inputs to the flash flood susceptibility models. The results showed that ERT was the best-performing model, with an area under curve (AUC) value of 0.82. The AUC values of the remaining models (RRF, PRF, RF, and BRT) were 0.80, 0.79, 0.78, and 0.75, respectively. In the ERT model, the areal coverage of the very high to moderate flash flood susceptibility zones was 582.56 km² (28.33%), and the remainder was associated with very low to low susceptibility zones. It is concluded that topographical and hydrological parameters, e.g., altitude, slope, rainfall, and distance to the river, were the most effective parameters. The results of this study will play a vital role in the planning and implementation of flood mitigation strategies in the region.


2019 ◽  
Vol 11 (19) ◽  
pp. 5426 ◽  
Author(s):  
Saeid Janizadeh ◽  
Mohammadtaghi Avand ◽  
Abolfazl Jaafari ◽  
Tran Van Phong ◽  
Mahmoud Bayat ◽  
...  

Floods are some of the most destructive and catastrophic disasters worldwide. Development of management plans needs a deep understanding of the likelihood and magnitude of future flood events. The purpose of this research was to estimate flash flood susceptibility in the Tafresh watershed, Iran, using five machine learning methods, i.e., alternating decision tree (ADT), functional tree (FT), kernel logistic regression (KLR), multilayer perceptron (MLP), and quadratic discriminant analysis (QDA). A geospatial database including 320 historical flood events was constructed and eight geo-environmental variables—elevation, slope, slope aspect, distance from rivers, average annual rainfall, land use, soil type, and lithology—were used as flood influencing factors. Based on a variety of performance metrics, it is revealed that the ADT method was dominant over the other methods. The FT method was ranked as the second-best method, followed by the KLR, MLP, and QDA. Given a few differences between the goodness-of-fit and prediction success of the methods, we concluded that all these five machine-learning-based models are applicable for flood susceptibility mapping in other areas to protect societies from devastating floods.


Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3590 ◽  
Author(s):  
Bui ◽  
Moayedi ◽  
Kalantar ◽  
Osouli ◽  
Gör ◽  
...  

In this research, the novel metaheuristic algorithm Harris hawks optimization (HHO) is applied to landslide susceptibility analysis in Western Iran. To this end, the HHO is synthesized with an artificial neural network (ANN) to optimize its performance. A spatial database comprising 208 historical landslides, as well as 14 landslide conditioning factors (elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), topographic wetness index (TWI), and rainfall), is prepared to develop the ANN and HHO–ANN predictive tools. Mean square error and mean absolute error criteria are defined to measure the performance error of the models, and the area under the receiver operating characteristic curve (AUROC) is used to evaluate the accuracy of the generated susceptibility maps. The findings showed that the HHO algorithm effectively improved the performance of the ANN in both recognizing (AUROC_ANN = 0.731 and AUROC_HHO–ANN = 0.777) and predicting (AUROC_ANN = 0.720 and AUROC_HHO–ANN = 0.773) the landslide pattern.
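
The AUROC used above can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. A minimal sketch of that pairwise (Mann-Whitney) formulation, using made-up labels and scores rather than the study's data:

```python
def auroc(labels, scores):
    """AUROC as the share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auroc(labels, scores))  # 0.75
```

The double loop is quadratic in the number of samples, which is fine for an illustration; sorting-based formulations scale better on large inventories.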


2021 ◽  
Vol 14 (1) ◽  
pp. 439
Author(s):  
Gadisa Fayera Gemechu ◽  
Xiaoping Rui ◽  
Haiyue Lu

Wetlands are a distinctive terrestrial ecosystem that benefits living things, including people, in various ways. Sustainable wetland ecosystem resources are needed to protect the global environment. Wetlands in China have undergone positive and negative changes in response to several factors, but studies documenting their long-term dynamics have been few, particularly in Guangling County. This study examines the change in wetland area based on remotely sensed data while exploring trends associated with climate variations and economic growth in Guangling County, China. Analysis of remotely sensed imagery, mainly in hilly and nonhomogeneous environments, is problematic, largely as a result of interference and high spectral non-homogeneity. We conducted experiments using five classical machine learning algorithms based on the Google Earth Engine (GEE) and obtained the greatest robustness and accuracy using a Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel, with overall accuracy and kappa statistics ranging from 86% to 98.1% and from 0.789 to 0.960, respectively. Because the SVM-RBF model outperformed the four other algorithms, we used it to identify the spatial distribution of wetland in the study area and the associated change trends. We found that 45.71 km² of wetland area was lost over the past 3.7 decades (January 1984–December 2020), or 81.82% of wetland area coverage. In this paper, we explore how factors such as county economic growth (GDP), humidity, and temperature variations are tightly linked with wetland change.


2021 ◽  
Author(s):  
Hossein Hamedi Sorajar ◽  
Ali Asghar Alesheikh ◽  
Mahdi Panahi ◽  
Saro Lee

Abstract. Landslides are one of the most destructive natural phenomena in the world. They occur mostly in mountainous areas, damage economic sectors, agricultural lands, residential areas and infrastructure, and threaten human lives and property. Landslide susceptibility mapping (LSM) can therefore play a critical role in identifying prone areas and reducing the damage caused by landslides. In the present study, deep learning algorithms including convolutional neural network (CNN) and long short-term memory (LSTM) were used to identify landslide-prone areas in Ardabil province, Iran. A total of 312 landslide locations were identified and randomly divided into training and test datasets at a 70–30% ratio. Then, according to previous studies and environmental conditions in the study area, twelve factors affecting the occurrence of landslides were selected, namely altitude, slope angle, slope aspect, topographic wetness index (TWI), profile curvature, plan curvature, land use, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. The relative importance of each factor in landslide occurrence was obtained through the information gain ranking filter (IGRF) method, and it was found that land use and profile curvature had the highest and lowest impacts, respectively. Afterwards, LSMs were generated using the CNN and LSTM algorithms. In the next step, the performance of the models was evaluated based on the area under curve (AUC) value of the receiver operating characteristic curve and the root mean square error (RMSE). The AUC values for the CNN and LSTM models were 0.821 and 0.832, respectively. Furthermore, the RMSE values of the CNN model for the training and testing datasets were 0.121 and 0.132, respectively. The RMSE values of the LSTM model for the training and testing datasets were 0.185 and 0.188, respectively. Therefore, it can be concluded that CNN performance is slightly better than LSTM; but in general, both models have close performance and the accuracy of both models is acceptable.
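
The RMSE reported above measures the typical gap between predicted susceptibility values and the observed labels. A minimal sketch with made-up numbers (the abstract does not state what the predictions were compared against, so the 0/1 labels here are an assumption):

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up example: 1 = landslide, 0 = non-landslide, predictions in [0, 1].
observed = [1, 0, 1, 0]
predicted = [0.9, 0.1, 0.8, 0.3]
print(round(rmse(observed, predicted), 3))
```

Lower values are better, and similar training and testing RMSE values suggest that a model is not strongly overfitted.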


2021 ◽  
Author(s):  
Tingyu Zhang ◽  
Quan Fu ◽  
Hao Wang ◽  
Fangfang Liu ◽  
Huanyuan Wang ◽  
...  

Abstract. Landslide hazards have attracted increasing public attention over the past decades due to the catastrophic consequences of landslide occurrence. The mitigation and prevention of landslide hazards have therefore become topical issues, and numerous studies on landslide susceptibility assessment have appeared in recent years. In this paper, four benchmark models, namely best-first decision tree (BFTree), functional tree (FT), support vector machine (SVM) and classification regression tree (CART), were integrated with a bagging strategy. These bagging-based models were then applied to map regional landslide susceptibility in Jiange County, Sichuan Province, China. Fifteen conditioning factors were employed in establishing the landslide susceptibility models, namely slope aspect, slope angle, elevation, plan curvature, profile curvature, TWI, SPI, STI, lithology, soil, land use, NDVI, distance to rivers, distance to roads and distance to lineaments. The correlation attribute evaluation (CAE) method was then used to weigh the contribution of each factor. Finally, the performance of the bagging-based models and the corresponding benchmark models was evaluated and systematically compared using the receiver operating characteristic (ROC) curve and area under curve (AUC) values. Results demonstrated that the bagging-based ensemble models significantly outperformed their corresponding benchmark models on the validation dataset. Among them, the Bag-CART model had the highest AUC value of 0.874, whereas the AUC value of the CART model was only 0.766, which reflects the satisfactory predictive capacity of the integrated models. The results of this study have reference value for landslide prevention and land resource planning in Jiange County.
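
The bagging strategy described above trains each base model on a bootstrap resample of the training data and combines the predictions by voting. The sketch below illustrates the idea with a one-feature threshold "stump" as a stand-in base learner; it is not BFTree, FT, SVM or CART, and all names and data are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Choose the threshold t that minimises errors for the rule 'predict 1 if x > t'."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def fit_bagged_stumps(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def predict(thresholds, x):
    """Majority vote over all stumps (n_models is odd, so no ties)."""
    votes = sum(x > t for t in thresholds)
    return int(2 * votes > len(thresholds))


# Hypothetical one-feature data: low values are class 0, high values class 1.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_bagged_stumps(xs, ys)
print(predict(model, 0), predict(model, 10))
```

Averaging over resamples reduces the variance of unstable base learners such as decision trees, which is consistent with the gain of Bag-CART over CART reported in the abstract.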


2020 ◽  
Vol 12 (21) ◽  
pp. 3609
Author(s):  
Xinchuan Li ◽  
Juhua Luo ◽  
Xiuliang Jin ◽  
Qiaoning He ◽  
Yun Niu

Spatially continuous soil thickness data at large scales are usually not readily available and are often difficult and expensive to acquire. Various machine learning algorithms have become very popular in digital soil mapping to predict and map the spatial distribution of soil properties. Identifying the controlling environmental variables of soil thickness and selecting suitable machine learning algorithms are vitally important in modeling. In this study, 11 quantitative and four qualitative environmental variables were selected to explore the main variables that affect soil thickness. Four commonly used machine learning algorithms (multiple linear regression (MLR), support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGBoost)) were evaluated as individual models to separately predict and obtain a soil thickness distribution map in Henan Province, China. In addition, two stacking ensemble models using least absolute shrinkage and selection operator (LASSO) and generalized boosted regression model (GBM) were tested and applied to build the most reliable and accurate estimation model. The results showed that variable selection was a very important part of soil thickness modeling. Topographic wetness index (TWI), slope, elevation, land use and enhanced vegetation index (EVI) were the most influential environmental variables in soil thickness modeling. Comparative results showed that the XGBoost model outperformed the MLR, RF and SVR models. Importantly, the two stacking models achieved higher performance than the single models, especially when using GBM. In terms of accuracy, the proposed stacking method explained 64.0% of the variation in soil thickness. The results of our study provide useful alternative approaches for mapping soil thickness, with potential for use with other soil properties.
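
A statement such as "explained 64.0% of the variation" is usually a coefficient of determination (R²). Assuming that reading (the abstract does not name the metric), a minimal sketch with made-up thickness values and an illustrative function name:

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Made-up soil thickness values (arbitrary units) and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(round(r_squared(observed, predicted), 2))  # 0.9
```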

