blockCV: An R package for generating spatially or environmentally separated folds for k‐fold cross‐validation of species distribution models

Summary: When applied to structured data, conventional random cross-validation techniques can lead to underestimation of prediction error, and may result in inappropriate model selection. We present the R package blockCV, a new toolbox for cross-validation of species distribution modelling. The package can generate spatially or environmentally separated folds. It includes tools to measure spatial autocorrelation ranges in candidate covariates, providing the user with insights into the spatial structure in these data. It also offers interactive graphical capabilities for creating spatial blocks and exploring data folds. Package blockCV enables modellers to more easily implement a range of evaluation approaches. It will help the modelling community learn more about the impacts of evaluation approaches on our understanding of predictive performance of species distribution models.
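
The idea of spatially separated folds can be illustrated with a minimal sketch. This is not the blockCV implementation or its API; it is a plain Python illustration under simple assumptions: points are planar (x, y) coordinates, blocks are squares of side `block_size`, and blocks are dealt to folds at random. The function name `spatial_block_folds` and its parameters are hypothetical.

```python
import math
import random


def spatial_block_folds(points, block_size, k, seed=0):
    """Assign each (x, y) point to one of k folds, block by block.

    Hypothetical helper for illustration only (not part of blockCV).
    All points that fall in the same square block share a fold, so
    nearby records are never split between training and testing.
    """
    # Map every point to the grid cell (block) that contains it.
    block_of = [
        (math.floor(x / block_size), math.floor(y / block_size))
        for x, y in points
    ]
    # Shuffle the occupied blocks reproducibly, then deal them
    # out to the k folds in round-robin order.
    blocks = sorted(set(block_of))
    rng = random.Random(seed)
    rng.shuffle(blocks)
    fold_of_block = {b: i % k for i, b in enumerate(blocks)}
    return [fold_of_block[b] for b in block_of]


# Five records in three blocks of side 1.0, split into two folds.
records = [(0.5, 0.5), (0.7, 0.2), (5.1, 5.2), (5.9, 5.5), (10.3, 0.1)]
print(spatial_block_folds(records, block_size=1.0, k=2))
```

In practice the block size would be chosen with reference to the spatial autocorrelation range of the covariates, which is what the package's autocorrelation tools are described as measuring.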

sdmbench: R package for benchmarking species distribution models

The Journal of Open Source Software ◽

10.21105/joss.00847 ◽

2018 ◽

Vol 3 (29) ◽

pp. 847 ◽

Cited By ~ 2

Author(s):

Boyan Angelov

Keyword(s):

Species Distribution ◽

Species Distribution Models ◽

R Package ◽

Distribution Models

phyr: An R package for phylogenetic species-distribution modelling in ecological communities

10.1101/2020.02.17.952317 ◽

2020 ◽

Author(s):

Daijiang Li ◽

Russell Dinnage ◽

Lucas Nell ◽

Matthew R. Helmus ◽

Anthony Ives

Keyword(s):

Community Composition ◽

Species Distribution ◽

Species Distribution Models ◽

R Package ◽

Bipartite Network ◽

List Type ◽

Ecological Communities ◽

Phylogenetic Species ◽

Distribution Models ◽

Model Based

Summary: Model-based approaches are increasingly popular in ecological studies. A good example of this trend is the use of joint species distribution models to ask questions about ecological communities. However, most current applications of model-based methods do not include phylogenies despite the well-known importance of phylogenetic relationships in shaping species distributions and community composition. In part, this is due to a lack of accessible tools allowing ecologists to fit phylogenetic species distribution models easily. To fill this gap, the R package phyr (pronounced fire) implements a suite of metrics, comparative methods and mixed models that use phylogenies to understand and predict community composition and other ecological and evolutionary phenomena. The phyr workhorse functions are implemented in C++, making all calculations and model estimations fast. phyr can fit a variety of models such as phylogenetic joint-species distribution models, spatiotemporal-phylogenetic autocorrelation models, and phylogenetic trait-based bipartite network models. phyr also estimates phylogenetically independent trait correlations with measurement error to test for adaptive syndromes and performs fast calculations of common alpha and beta phylogenetic diversity metrics. All phyr methods are united under Brownian motion or Ornstein-Uhlenbeck models of evolution, and phylogenetic terms are modelled as phylogenetic covariance matrices. The functions and model formula syntax we propose in phyr serve as a simple and unified framework that ignites the use of phylogenies to address a variety of ecological questions.

SSDM: An R package to predict distribution of species richness and composition based on stacked species distribution models

Methods in Ecology and Evolution ◽

10.1111/2041-210x.12841 ◽

2017 ◽

Vol 8 (12) ◽

pp. 1795-1803 ◽

Cited By ~ 32

Author(s):

Sylvain Schmitt ◽

Robin Pouteau ◽

Dimitri Justeau ◽

Florian Boissieu ◽

Philippe Birnbaum

Keyword(s):

Species Richness ◽

Species Distribution ◽

Species Distribution Models ◽

R Package ◽

Distribution Models

How citizen science could improve Species Distribution Models and their independent assessment

10.1101/2020.06.02.129536 ◽

2020 ◽

Cited By ~ 1

Author(s):

Matutini Florence ◽

Baudry Jacques ◽

Pain Guillaume ◽

Sineau Morgane ◽

Pithon Joséphine

Keyword(s):

Citizen Science ◽

Species Distribution ◽

Cross Validation ◽

Species Distribution Models ◽

External Validation ◽

Receiver Operating Curve ◽

Sampling Effort ◽

Independent Data ◽

External Evaluation ◽

Distribution Models

Abstract: Species distribution models (SDMs) have been increasingly developed in recent years, but their validity is questioned. Their assessment can be improved by the use of independent data, but these can be difficult to obtain and prohibitively costly to collect. Standardized data from citizen science may be used to establish external evaluation datasets and to improve SDM validation and applicability. We used opportunistic presence-only data along with presence-absence data from a standardized citizen science program to establish and assess habitat suitability maps for 9 species of amphibian in western France. We assessed the performance of Generalized Additive Models and Random Forest models by (1) cross-validation using 30% of the opportunistic dataset used to calibrate the model or (2) external validation using different independent datasets derived from citizen science monitoring. We tested the effects of applying different combinations of filters to the citizen data and of complementing it with additional standardized fieldwork. Cross-validation with an internal evaluation dataset resulted in higher AUC (Area Under the receiver operating Curve) than external evaluation, causing overestimation of model accuracy, and did not select the same models; models integrating sampling effort performed better with external validation. AUC, specificity and sensitivity of models calculated with different filtered external datasets differed for some species. However, for most species, complementary fieldwork was not necessary to obtain coherent results, as long as the citizen science data was strongly filtered. Since external validation methods using independent data are considered more robust, filtering data from citizen science may make a valuable contribution to the assessment of SDMs. Limited complementary fieldwork with volunteers' participation to complete ecological gradients may also enhance citizen involvement and lead to better use of SDMs in decision processes for nature conservation.

The MIGCLIM R package - seamless integration of dispersal constraints into projections of species distribution models

Ecography ◽

10.1111/j.1600-0587.2012.07608.x ◽

2012 ◽

Vol 35 (10) ◽

pp. 872-878 ◽

Cited By ~ 71

Author(s):

Robin Engler ◽

Wim Hordijk ◽

Antoine Guisan

Keyword(s):

Species Distribution ◽

Species Distribution Models ◽

R Package ◽

Distribution Models ◽

Seamless Integration

The MIAmaxent R package: Variable transformation and model selection for species distribution models

Ecology and Evolution ◽

10.1002/ece3.5654 ◽

2019 ◽

Vol 9 (21) ◽

pp. 12051-12068 ◽

Cited By ~ 4

Author(s):

Julien Vollering ◽

Rune Halvorsen ◽

Sabrina Mazzoni

Keyword(s):

Model Selection ◽

Species Distribution ◽

Species Distribution Models ◽

R Package ◽

Distribution Models ◽

Variable Transformation ◽

Selection For

Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null model

Ecology ◽

10.1890/11-0826.1 ◽

2012 ◽

Vol 93 (3) ◽

pp. 679-688 ◽

Cited By ~ 302

Author(s):

Robert J. Hijmans

Keyword(s):

Species Distribution ◽

Cross Validation ◽

Species Distribution Models ◽

Null Model ◽

Distribution Models ◽

Spatial Sorting

RUSBoost: A suitable species distribution method for imbalanced records of presence and absence. A case study of twenty-five species of Iberian bats

10.1101/2021.10.06.463434 ◽

2021 ◽

Author(s):

Jaime Carrasco ◽

Fulgencio Lison ◽

Andres Weintraub

Keyword(s):

Species Distribution ◽

Cross Validation ◽

Species Distribution Models ◽

Bayesian Optimization ◽

List Type ◽

Distribution Models ◽

Distribution Method ◽

Explanatory Variables ◽

Presence And Absence

Traditional Species Distribution Models (SDMs) may not be appropriate when examples of one class (e.g. absences or pseudo-absences) greatly outnumber examples of the other class (e.g. presences or observations), because they tend to favor learning the more frequent class. We present an ensemble method called Random UnderSampling and Boosting (RUSBoost), which was designed to address the case where the numbers of presence and absence records are imbalanced, and we opened the "black box" of the algorithm to interpret its results and applicability in ecology. We applied our methodology to a case study of twenty-five species of bats from the Iberian Peninsula and built a RUSBoost model for each species. Furthermore, to build tighter models, we optimized their hyperparameters using Bayesian Optimization. In particular, we implemented an objective function that represents the cross-validation loss: kFoldLoss(z), with z representing the hyperparameters Maximum Number of Splits, Number of Learners and Learning Rate. The models reached average values for Area Under the ROC Curve (AUC), specificity, sensitivity, and overall accuracy of 0.84±0.05, 79.5±4.87%, 74.9±6.05%, and 78.8±5.0%, respectively. We also obtained values of variable importance and analyzed the relationships between explanatory variables and bat presence probability. The results of our study showed that RUSBoost could be a useful tool to develop SDMs with good performance when the presence/absence databases are imbalanced. The application of this algorithm could improve the prediction of SDMs and help in conservation biology and management.
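
The random undersampling step that gives RUSBoost its name can be sketched as follows. This is only an illustration of class balancing by undersampling, not the authors' code and not a full RUSBoost implementation: the boosting loop is omitted, the function name `random_undersample` is hypothetical, and balancing every class down to the size of the rarest class is an assumption made here for simplicity.

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset of the records.

    Hypothetical helper for illustration only. Every class is randomly
    reduced to the size of the rarest class, so that a learner trained
    on the subset sees presences and absences equally often.
    """
    rng = random.Random(seed)
    # Group record indices by class label.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    # Size of the rarest class sets the per-class sample size.
    n_min = min(len(idx) for idx in by_class.values())
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_min))
    return sorted(kept)


# 20 absences (0) and 4 presences (1): the subset keeps 4 of each.
labels = [0] * 20 + [1] * 4
subset = random_undersample(labels)
print(len(subset), sum(labels[i] for i in subset))
```

In a boosting ensemble of this kind, a fresh undersampled subset would typically be drawn for each weak learner, so that different majority-class records are seen across iterations.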

Improving the interpretability of species distribution models by using local approximations

10.1101/454991 ◽

2018 ◽

Author(s):

Boyan Angelov

Keyword(s):

Machine Learning ◽

Species Distribution ◽

Species Distribution Models ◽

R Package ◽

Ecological Niches ◽

Distribution Models ◽

Domain Experts ◽

Applied Machine Learning ◽

Black Boxes ◽

Interpretable Model

Abstract: Species Distribution Models (SDMs) are used to generate maps of realised and potential ecological niches for a given species. Like any other machine learning technique, they can be seen as “black boxes”, due to a lack of interpretability. Advances in other areas of applied machine learning can be applied to remedy this problem. In this study we test a new tool relying on Local Interpretable Model-agnostic Explanations (LIME) by comparing its results with those of other known methods and with ecological interpretations from domain experts. The findings confirm that LIME provides consistent and ecologically sound explanations of climate feature importance during the training of SDMs, and that the sdmexplain R package can be used with confidence.
