When Is Enough Really Enough? On the Minimum Number of Landslides to Build Reliable Susceptibility Models
Mapping existing landslides is a fundamental prerequisite for building any reliable susceptibility model. From a set of landslide presence/absence observations and the associated landscape characteristics, a binary classifier learns to distinguish potentially stable from unstable slopes. Even in data-rich areas where landslide inventories exist, assembling them can already be a challenging task. In data-scarce contexts, where geoscientists have no access to pre-existing inventories, the only solution is to map landslides from scratch. This operation is extremely time-consuming when performed manually and prone to type I errors when automated, and both problems worsen over large geographic regions. In this manuscript we examine mapping requirements for west Tajikistan, where no complete landslide inventory is available. The key question is: how many landslides are required to develop reliable landslide susceptibility models based on statistical modeling? For such a vast and extremely complex territory, compiling a sufficiently detailed inventory demands a large investment of time and human resources. But at which point in the mapping procedure does the resulting susceptibility model stop improving significantly compared with a model built with less information? We addressed this question by training and validating a binomial Generalized Additive Model with different proportions of the mapped landslides and measuring the variability induced in the resulting susceptibility estimates. The results of this study are site-specific, but we propose a practical protocol for investigating a problem that is underestimated in the literature.
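The subsampling experiment described above can be sketched as follows. This is a minimal, numpy-only illustration on synthetic data: the terrain attributes, their coefficients, and the piecewise-linear basis (a crude stand-in for a GAM's penalized smooths, fitted here by plain IRLS without smoothing penalties) are all hypothetical and not taken from the study. It trains a binomial model on shrinking fractions of the mapped landslide presences and tracks how predictive skill on a held-out set degrades.

```python
import numpy as np

rng = np.random.default_rng(42)

# --- Synthetic stand-in for the study area: hypothetical terrain attributes ---
n = 4000
slope = rng.uniform(0, 60, n)      # slope angle in degrees (illustrative)
relief = rng.uniform(0, 1, n)      # normalized local relief (illustrative)
logit_true = -4.0 + 0.08 * slope + 2.0 * relief
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit_true)))  # presence/absence labels

def basis(x, knots):
    """Piecewise-linear spline basis: a crude stand-in for GAM smooth terms."""
    return np.column_stack([x] + [np.maximum(x - k, 0.0) for k in knots])

X = np.column_stack([np.ones(n),
                     basis(slope, [15.0, 30.0, 45.0]),
                     basis(relief, [0.25, 0.50, 0.75])])

def fit_irls(X, y, iters=25):
    """Fit a binomial GLM by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p) + 1e-9                  # working weights
        z = X @ beta + (y - p) / w                # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    return beta

def auc(scores, labels):
    """Rank-based AUC (Mann-Whitney statistic)."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hold out a test set, then retrain with shrinking fractions of the mapped
# landslides to see how predictive skill varies as the inventory thins out.
test = rng.random(n) < 0.3
pos_idx = np.where(y == 1)[0]
for frac in (1.0, 0.5, 0.25, 0.1):
    keep = rng.choice(pos_idx, int(len(pos_idx) * frac), replace=False)
    sel = np.zeros(n, dtype=bool)
    sel[keep] = True
    train = ~test & (sel | (y == 0))              # all absences + presence subset
    p_hat = 1.0 / (1.0 + np.exp(-X[test] @ fit_irls(X[train], y[train])))
    print(f"presence fraction = {frac:.2f}  test AUC = {auc(p_hat, y[test]):.3f}")
```

In practice one would fit the binomial GAM with a dedicated library (e.g. `mgcv` in R or `pygam` in Python) on real covariates, and repeat each subsampling level many times to quantify the variability of the resulting susceptibility maps rather than a single AUC value.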