Sam Comber edited this page Jun 12, 2020 · 16 revisions

What Is Spacv?

Spacv is a spatial machine learning library of cross-validation techniques for assessing how well models generalize to datasets with structural dependence. The library provides

What Is Spacv For?

The intended uses of Spacv are

  • test
  • test
  • test

What Problem Does Spacv Solve?

Datasets with correlation structures are commonplace in spatial statistical applications. Values at nearby observations tend to be more similar than values at distant observations, and this underlying dependence structure is problematic for model validation, model selection and the estimation of predictive error. Cross-validation procedures typically split the initial dataset into

  • h-blocking

  • Spatial Leave One Out (SLOO)

  • Rabinowicz' bias-corrected CV measure

  • Area of applicability for spatial prediction
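The buffering idea behind the spatial leave-one-out entry above can be illustrated with a minimal sketch. It assumes planar coordinates and Euclidean distance, and the function name `spatial_leave_one_out` and its `buffer_radius` parameter are hypothetical, chosen for illustration only; they are not Spacv's API. Each observation is held out in turn, and every training point within the buffer distance of it is dropped so that near neighbours cannot leak information into the test fold.

```python
import numpy as np


def spatial_leave_one_out(coords, buffer_radius):
    """Yield (train_idx, test_idx) pairs, one per observation.

    Hypothetical helper, not part of Spacv. Training points that lie
    within `buffer_radius` of the held-out point are excluded.
    """
    coords = np.asarray(coords, dtype=float)
    for i in range(len(coords)):
        # Euclidean distance from the held-out point to every observation.
        dist = np.linalg.norm(coords - coords[i], axis=1)
        # Keep only points strictly outside the buffer; the held-out
        # point itself has distance 0 and is therefore never kept.
        train_idx = np.flatnonzero(dist > buffer_radius)
        yield train_idx, np.array([i])


# Four points: the first two are close together, the last two are far away.
coords = np.array([[0.0, 0.0], [0.5, 0.0], [3.0, 0.0], [3.0, 4.0]])
folds = [
    (train.tolist(), test.tolist())
    for train, test in spatial_leave_one_out(coords, buffer_radius=1.0)
]
```

With a buffer of 1.0, holding out the first point also removes the second point from the training set, because the two are only 0.5 units apart.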

How Does Spacv Accomplish Its Goals?

Spacv provides a sklearn-like interface for cross-validation exercises with correlated data.

  • test
  • test
  • test
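To make "sklearn-like interface" concrete, the sketch below shows what a splitter following scikit-learn's cross-validator convention (a `split` method yielding train and test index arrays, plus `get_n_splits`) might look like for spatial blocks. The class name `GridBlockSplitter` and its `cell_size` parameter are assumptions made for this example and are not taken from Spacv; the blocking rule (one fold per occupied square grid cell) is likewise only one simple choice.

```python
import numpy as np


class GridBlockSplitter:
    """Hypothetical leave-one-block-out splitter; not part of Spacv."""

    def __init__(self, coords, cell_size):
        coords = np.asarray(coords, dtype=float)
        # Integer (column, row) index of the grid cell containing each point.
        cells = np.floor(coords / cell_size).astype(int)
        # Map each distinct occupied cell to a block label 0..k-1.
        _, labels = np.unique(cells, axis=0, return_inverse=True)
        self.block_labels_ = labels.ravel()

    def get_n_splits(self, X=None, y=None, groups=None):
        # One fold per occupied grid cell.
        return int(self.block_labels_.max()) + 1

    def split(self, X=None, y=None, groups=None):
        # Hold out one block at a time; train on all remaining blocks.
        for block in range(self.get_n_splits()):
            test_mask = self.block_labels_ == block
            yield np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


coords = np.array([[0.2, 0.3], [0.8, 0.1], [1.5, 0.5], [1.7, 1.2]])
splitter = GridBlockSplitter(coords, cell_size=1.0)
folds = [(train.tolist(), test.tolist()) for train, test in splitter.split()]
```

Because the object exposes `split` and `get_n_splits`, it should be usable wherever scikit-learn accepts a cross-validation generator, for example as the `cv` argument of `cross_val_score`.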

Existing software

Related literature

  • Airola, A., Pohjankukka, J., Torppa, J. et al. The spatial leave-pair-out cross-validation method for reliable AUC estimation of spatial classifiers. Data Min Knowl Disc 33, 730–747 (2019). https://doi.org/10.1007/s10618-018-00607-x

  • Hijmans, R.J. (2012), Cross‐validation of species distribution models: removing spatial sorting bias and calibration with a null model. Ecology, 93: 679-688. doi:10.1890/11-0826.1

  • Meyer, Hanna & Reudenbach, Christoph & Hengl, Tomislav & Katurji, Marwan & Nauss, Thomas. (2018). Improving performance of spatio-temporal machine learning models using forward feature selection and target-oriented validation. Environmental Modelling & Software. 101. 1 - 9. 10.1016/j.envsoft.2017.12.001.

  • Meyer, H. and Pebesma, E. (2020). Predicting into unknown space? Estimating the area of applicability of spatial prediction models. https://arxiv.org/abs/2005.07939

  • Pohjankukka, J., Pahikkala, T., Nevalainen P. & Heikkonen, J. (2017) Estimating the prediction performance of spatial models via spatial k-fold cross validation, International Journal of Geographical Information Science, 31:10, 2001-2019, DOI: 10.1080/13658816.2017.1346255

  • Rabinowicz, Assaf & Rosset, Saharon. (2019). Cross-Validation for Correlated Data. https://arxiv.org/abs/1904.02438

  • Le Rest, K., Pinaud, D., Monestiez, P., Chadoeuf, J. and Bretagnolle, V. (2014), Spatial leave‐one‐out cross‐validation for variable selection in the presence of spatial autocorrelation. Global Ecology and Biogeography, 23: 811-820. doi:10.1111/geb.12161

  • Roberts, D.R., Bahn, V., Ciuti, S., Boyce, M.S., Elith, J., Guillera‐Arroita, G., Hauenstein, S., Lahoz‐Monfort, J.J., Schröder, B., Thuiller, W., Warton, D.I., Wintle, B.A., Hartig, F. and Dormann, C.F. (2017), Cross‐validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure. Ecography, 40: 913-929. doi:10.1111/ecog.02881

  • Schratz, P., Muenchow, J., Iturritxa, E., Richter, J., Brenning, A. (2019). Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data. Ecological Modelling. 406. 109-120. 10.1016/j.ecolmodel.2019.06.002.

  • Valavi, R, Elith, J, Lahoz‐Monfort, JJ, Guillera‐Arroita, G. blockCV: An R package for generating spatially or environmentally separated folds for k‐fold cross‐validation of species distribution models. Methods Ecol Evol. 2019; 10: 225–232. https://doi.org/10.1111/2041-210X.13107
