Effect of spatial resolution and data splitting on landslide susceptibility mapping using different machine learning algorithms

2021 ◽  
Vol 12 (1) ◽  
pp. 3381-3408
Author(s):  
Minu Treesa Abraham ◽  
Neelima Satyam ◽  
Prashita Jain ◽  
Biswajeet Pradhan ◽  
Abdullah Alamri
2020 ◽  
Vol 198 ◽  
pp. 03023
Author(s):  
Xin Yang ◽  
Rui Liu ◽  
Luyao Li ◽  
Mei Yang ◽  
Yuantao Yang

Landslide susceptibility mapping is a method used to assess the probability and spatial distribution of landslide occurrences. Machine learning methods have been widely used for landslide susceptibility assessment in recent years. In this paper, six popular machine learning algorithms, namely logistic regression, multi-layer perceptron, random forests, support vector machine, AdaBoost, and gradient boosted decision tree, were used to construct landslide susceptibility models from a total of 1365 landslide points and 14 predisposing factors. Landslide susceptibility maps (LSMs) were then generated from the trained models. The maps show that the main landslide zone is concentrated in the southeastern area of Wenchuan County. ROC curve analysis shows that all models fitted the training datasets and achieved satisfactory results on the validation datasets. These results indicate that machine learning methods are feasible for building robust landslide susceptibility models.
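The ROC analysis mentioned in this abstract is usually summarized by the area under the curve (AUC). The following is a minimal sketch of that quantity, using the rank (Mann-Whitney) formulation rather than any code from the paper. The function name `roc_auc` and the sample scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation.

    labels: 1 for a landslide point, 0 for a non-landslide point.
    scores: predicted susceptibility, higher means more susceptible.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores for illustration only, not data from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

Computing this value once on the training set and once on the validation set gives the two figures that such studies typically compare.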


Author(s):  
B. Kalantar ◽  
N. Ueda ◽  
H. A. H. Al-Najjar ◽  
M. B. A. Gibril ◽  
U. S. Lay ◽  
...  

Landslides are regarded as one of the most prevalent and devastating forms of mass movement affecting people and their environment. The specific objective of this research paper is to investigate the application and performance of selected machine learning algorithms (MLAs) in landslide susceptibility mapping in the Dodangeh watershed, Iran. An inventory of 112 past landslide occurrences was compiled from existing records and field observations. In addition, fourteen landslide-conditioning parameters were derived from a DEM and other topographic databases for the modelling process. These conditioning parameters are total curvature, profile curvature, plan curvature, slope, aspect, altitude, topographic wetness index (TWI), topographic roughness index (TRI), stream transport index (STI), stream power index (SPI), lithology, land use, distance to stream, and distance to fault. Factor analysis was employed to optimize the landslide conditioning parameters and the inventory data by assessing multi-collinearity effects and detecting outliers, respectively. The inventory data were divided into a 70% (78) training dataset and a 30% (34) test dataset for model validation. The receiver operating characteristic (ROC) curve and its area under the curve (AUC) value were used to assess model performance. The findings reveal that TRI has a collinearity effect of 0.89 based on the variance inflation factor (VIF) and that, based on Gini factor optimization, total curvature is not significant in the model development; the two parameters were therefore excluded from the modelling. All the selected MLAs (RF, BRT, and DT) showed promising performance for landslide susceptibility mapping in the Dodangeh watershed, Iran. RF achieved a success rate of 86% on the training data and a prediction rate of 83% on the validation data, the best model performance, compared with prediction rates of 72% for BRT and 70% for DT.
In conclusion, RF could be the best algorithm for producing a landslide susceptibility map, and such results could be adopted in the decision-making process to support land use planners in improving landslide risk assessment in similar environmental settings.
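The 70%/30% division of the 112 inventory points into 78 training and 34 test points can be reproduced with a simple shuffled split. This is a sketch under assumptions: the abstract does not say how the points were shuffled or whether a fixed seed was used, and `split_inventory` is a hypothetical helper name.

```python
import random


def split_inventory(points, train_fraction=0.7, seed=0):
    """Shuffle the inventory and split it into training and test subsets."""
    shuffled = list(points)
    # A fixed seed (an assumption here) keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 112 landslide inventory points.
inventory = list(range(112))
train, test = split_inventory(inventory)
print(len(train), len(test))  # 78 34
```

Rounding the training count down is what yields 78 rather than 79 points, which matches the sizes reported in the abstract.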


Land ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 989
Author(s):  
Minu Treesa Abraham ◽  
Neelima Satyam ◽  
Revuri Lokesh ◽  
Biswajeet Pradhan ◽  
Abdullah Alamri

Data-driven methods are widely used for the development of Landslide Susceptibility Mapping (LSM). The results of these methods are sensitive to different factors, such as the quality of input data, choice of algorithm, sampling strategies, and data splitting ratios. In this study, five different Machine Learning (ML) algorithms are used for LSM for the Wayanad district in Kerala, India, using two different sampling strategies and nine different train-to-test ratios in cross validation. The results show that the Random Forest (RF), K Nearest Neighbors (KNN), and Support Vector Machine (SVM) algorithms provide better results than Naïve Bayes (NB) and Logistic Regression (LR) for the study area. The NB and LR algorithms are less sensitive to the sampling strategy and data splitting, while the performance of the other three algorithms is considerably influenced by the sampling strategy. The results indicate that both the choice of algorithm and the sampling strategy are critical in obtaining the best suited landslide susceptibility map for a region. The accuracies of the KNN, RF, and SVM algorithms increased by 10.51%, 10.02%, and 4.98%, respectively, with the use of polygon landslide inventory data, while the performance of the NB and LR algorithms was slightly reduced with the use of polygon data. Thus, the sampling strategy and data splitting ratio are less consequential for the NB and LR algorithms, while more data points provide better results for the KNN, RF, and SVM algorithms.
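One way to obtain nine train-to-test ratios in cross validation is to vary the number of folds k from 2 to 10, so that each fold holds out 1/k of the data. The abstract does not state which ratios were used, so the range of k below is an assumption, and `kfold_indices` and `train_test_sizes` are hypothetical helpers.

```python
def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Assumes the samples were shuffled beforehand.
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


def train_test_sizes(n_samples, k):
    """Return (train size, test size) for the first of k folds."""
    train_idx, test_idx = next(kfold_indices(n_samples, k))
    return len(train_idx), len(test_idx)


# Assumption: k = 2..10, which yields nine different train-to-test ratios.
for k in range(2, 11):
    n_train, n_test = train_test_sizes(100, k)
    print(f"k={k}: train={n_train}, test={n_test}")
```

With 100 samples, k = 2 gives a 50:50 split and k = 10 gives 90:10, so a single loop over k covers the whole range of splitting ratios.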


2020 ◽  
Author(s):  
Naeem Shahzad ◽  
Xiaoli Ding ◽  
Sawaid Abbas

Machine learning has proven highly effective in mapping landslide susceptibility. We carry out experiments with two machine learning algorithms, SVM and MaxENT, to study their effectiveness for some mountainous areas in Pakistan. A data set of 112 historic landslides is used in the study, with 70% of the landslides used for training and the rest for validation. Fifteen landslide causative factors are used initially, and ineffective ones are eliminated based on information gain ratio and multicollinearity tests. The performance of the generated landslide susceptibility maps is assessed using receiver operating characteristic (ROC) curves, the confusion matrix (CM) (Kappa, root mean square error, mean absolute error and balanced accuracy), landslide density (LD), R-index and Pearson's Chi-squared tests. The results show that both models work well in this area. However, the low significance value 'p' (<0.05) in the Chi-squared test showed that the two landslide models differ with statistical significance.
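Of the confusion-matrix measures listed above, Cohen's Kappa is the least self-explanatory: it compares observed agreement with the agreement expected by chance. The sketch below uses the standard definition. The function name `cohens_kappa` and the counts are hypothetical and are not taken from the study.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's Kappa for a binary confusion matrix.

    tp, fp, fn, tn: true positive, false positive, false negative and
    true negative counts.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only.
kappa = cohens_kappa(tp=40, fp=10, fn=5, tn=45)
print(round(kappa, 3))  # 0.7
```

Here the observed agreement is 0.85 and the chance agreement is 0.5, so Kappa is 0.35 / 0.5 = 0.7.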


2021 ◽  
Vol 10 (10) ◽  
pp. 639
Author(s):  
Han Hu ◽  
Changming Wang ◽  
Zhu Liang ◽  
Ruiyuan Gao ◽  
Bailong Li

Landslides frequently occur because of natural or human factors and cause huge economic and human losses around the globe every year. Landslide susceptibility prediction (LSP) plays a key role in the prevention of landslides and has been under investigation for years. Although new machine learning algorithms have achieved excellent performance in terms of prediction accuracy, a sufficient quantity of training samples is essential. However, it is hard to obtain enough landslide samples in most areas, especially at the county level. The present study aims to explore an optimization model that combines conventional unsupervised and supervised learning methods and performs well with respect to prediction accuracy and comprehensibility. Logistic regression (LR), fuzzy c-means clustering (FCM) and factor analysis (FA) were combined to establish four models: the LR model, FCM coupled with the LR model, FA coupled with the LR model, and FCM and FA coupled with the LR model, which were applied in a specific area. First, an inventory with 114 landslides and 10 conditioning factors was prepared for modeling. Subsequently, the four models were applied to LSP. Finally, their performance was evaluated and compared by k-fold cross-validation based on statistical measures. The results showed that the model coupling FCM, FA and LR achieved the best performance among these models, with an AUC (area under the curve) value of 0.827, accuracy of 85.25%, sensitivity of 74.96% and specificity of 86.21%. The LR model performed the worst, with an AUC value of 0.736, accuracy of 77%, sensitivity of 62.52% and specificity of 72.55%. It was concluded that both dimension reduction and sample size should be considered in modeling, and that performance can be enhanced by combining complementary methods. The combination of models should be more flexible and purposeful.
This work provides a reference for related research and better guidance for engineering activities, decision-making by local administrations and land use planning.
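The accuracy, sensitivity and specificity figures quoted above are all derived from a binary confusion matrix. The sketch below shows the standard definitions. The function name `binary_metrics` and the labels are hypothetical, and the numbers are unrelated to the study's results.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = landslide)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of landslides that were detected
        "specificity": tn / (tn + fp),  # share of stable cells that were kept
    }


# Hypothetical labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics["accuracy"], metrics["sensitivity"])  # 0.8 0.75
```

In a k-fold setting, these measures would typically be computed on each held-out fold and then averaged.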

