Title: Spatial Datasets for Ecological Modeling
Version: 1.0.0
Description: Provides spatial datasets ready to use for ecological modeling and raster companion data for prediction: Neanderthal presence during the Last Interglacial (Benito et al. 2017 <doi:10.1111/jbi.12845>); plant diversity metrics for the World's Ecoregions (Maestre et al. 2021 <doi:10.1111/nph.17398>); tree richness across the Americas (Benito et al. 2013 <doi:10.1111/2041-210X.12022>); plant communities from the Sierra Nevada (Spain) with future climate scenarios (Benito et al. 2013 <doi:10.1111/2041-210X.12022>); butterfly-plant interaction data from Sierra Nevada (Spain) (Benito et al. 2011 <doi:10.1007/s10584-010-0015-3>); plant species occurrences in Andalusia (Spain) (Benito et al. 2014 <doi:10.1111/ddi.12148>); presence of the plant Linaria nigricans and greenhouses (Benito et al. 2009 <doi:10.1007/s10531-009-9604-8>); global NDVI and environmental predictors; and European oak species occurrences. All datasets include pre-processed environmental predictors ready for statistical modeling.
URL: https://blasbenito.github.io/spatialData/
BugReports: https://github.com/BlasBenito/spatialData/issues
License: CC BY 4.0
Encoding: UTF-8
RoxygenNote: 7.3.3
Imports: sf, terra
Suggests: spelling, testthat (>= 3.0.0)
Config/testthat/edition: 3
Language: en-US
Depends: R (>= 4.1.0)
LazyData: true
LazyDataCompression: xz
NeedsCompilation: no
Packaged: 2026-04-18 19:19:55 UTC; blas
Author: Blas M. Benito [aut, cre]
Maintainer: Blas M. Benito <blasbenito@gmail.com>
Repository: CRAN
Date/Publication: 2026-04-21 19:50:32 UTC

Presence records of 90 plant species and background points from Andalusia, Spain

Description

sf long format data frame with POINT geometry and CRS ETRS89 / UTM zone 30N (EPSG:25830), containing 37,773 presence records for 90 plant species and 8,692 background points (46,465 rows total) from Andalusia, Spain.

The dataset contains 3 columns (species, presence, geometry). Environmental predictors for each point can be extracted from the companion raster returned by andalusia_extra(). Predictor names are stored in andalusia_predictors.

Usage

data(andalusia)

Format

An sf data frame with 46,465 rows (presences and background points) and 3 columns:

Source

Published study:

Landsat imagery:

Climate variables:

Topography:

See Also

Other andalusia: andalusia_extra(), andalusia_predictors, andalusia_responses

Examples

data(andalusia)
colnames(andalusia)
nrow(andalusia)
ncol(andalusia)

Environmental raster for the dataset andalusia

Description

Downloads and reads the 20-band environmental raster associated with the andalusia dataset from the spatialDataExtra repository. The raster covers Andalusia, Spain, at 400 m resolution (EPSG:25830) and includes remote-sensing, climate, and topographic predictors (see andalusia).

Usage

andalusia_extra()

Value

SpatRaster object with 20 layers.

See Also

Other andalusia: andalusia, andalusia_predictors, andalusia_responses


Predictor names for the dataset andalusia

Description

Character vector of 20 predictor variable names corresponding to the layers of the environmental raster returned by andalusia_extra(), covering Landsat reflectance (7), rainfall (2), solar radiation (2), temperature (4), and topography (5). These are not columns in andalusia; use terra::extract() to attach them to the point data.
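A minimal sketch of that extraction step, using a small synthetic raster and two synthetic points as stand-ins so that it runs without downloading anything. The layer names, coordinates, and species label are hypothetical; with the real data, env would be the SpatRaster returned by andalusia_extra() and pts the andalusia data frame.

```r
library(sf)
library(terra)

# Toy stand-in for the raster returned by andalusia_extra():
# a 10 x 10 grid of 400 m cells in EPSG:25830 with two layers.
env <- terra::rast(
  nrows = 10, ncols = 10, nlyrs = 2,
  xmin = 0, xmax = 4000, ymin = 0, ymax = 4000,
  crs = "EPSG:25830"
)
terra::values(env) <- cbind(1:100, 101:200)
names(env) <- c("layer_a", "layer_b")  # hypothetical layer names

# Toy stand-in for andalusia: same columns (species, presence, geometry).
pts <- sf::st_as_sf(
  data.frame(
    species = c("species_a", "background"),  # hypothetical labels
    presence = c(1L, 0L),
    x = c(200, 3800),
    y = c(3800, 200)
  ),
  coords = c("x", "y"),
  crs = 25830
)

# Extract one value per layer at each point; ID = FALSE drops the row index.
vals <- terra::extract(env, terra::vect(pts), ID = FALSE)

# Attach the extracted predictors to the point data.
pts_env <- cbind(pts, vals)
colnames(pts_env)
```

The points are converted with terra::vect() before extraction, which keeps the sketch independent of how a given terra version handles sf objects.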

Usage

data(andalusia_predictors)

Format

Character vector of length 20.

See Also

Other andalusia: andalusia, andalusia_extra(), andalusia_responses

Examples

data(andalusia_predictors)
andalusia_predictors

Response names for the dataset andalusia

Description

Character vector of length 2 containing the names of the response variables in andalusia: "species" (character, species name or "background" for 90 species plus background points) and "presence" (binary integer, 1 = confirmed species presence, 0 = background point).
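Because the data are in long format, a table for a single species can be assembled by keeping the rows of that species together with the background rows. The sketch below illustrates this on a toy data frame with the same two response columns; the species labels are hypothetical and the geometry column is omitted for brevity.

```r
# Toy stand-in for the response columns of andalusia.
toy <- data.frame(
  species = c("species_a", "species_a", "species_b",
              "background", "background"),
  presence = c(1L, 1L, 1L, 0L, 0L)
)

# Keep presences of the target species plus all background points.
target <- "species_a"  # hypothetical species label
model_df <- toy[toy$species %in% c(target, "background"), ]

# Presence (1) versus background (0) counts for the target species.
table(model_df$presence)
```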

Usage

data(andalusia_responses)

Format

Character vector of length 2.

See Also

Other andalusia: andalusia, andalusia_extra(), andalusia_predictors

Examples

data(andalusia_responses)
andalusia_responses

Plant Communities of Sierra Nevada (Spain)

Description

sf data frame with POINT geometry containing 6,747 plant community records from the Sierra Nevada mountain range (SE Spain), with 6 response variables (see communities_responses) and 9 numeric predictors (see communities_predictors). Use communities_extra_2010(), communities_extra_2050(), and communities_extra_2100() to download the associated environmental rasters for the baseline (2010), 2050, and 2100 climate scenarios.

Usage

data(communities)

Format

An sf data frame with 6,747 rows and 16 columns:

Response variables (6):

Predictor variables:

Geometry:

Source

See Also

Other communities: communities_extra_2010(), communities_extra_2050(), communities_extra_2100(), communities_predictors, communities_responses

Examples

data(communities)
colnames(communities)
nrow(communities)
ncol(communities)

Download Environmental Raster for communities - 2010

Description

Downloads the baseline (2010) environmental raster associated with the communities dataset from the spatialDataExtra repository. Writes the file communities_2010.tif in the working directory and returns it as a SpatRaster object.

Usage

communities_extra_2010()

Value

SpatRaster object.

See Also

Other communities: communities, communities_extra_2050(), communities_extra_2100(), communities_predictors, communities_responses


Download Environmental Raster for communities - 2050

Description

Downloads the future climate (2050) raster associated with the communities dataset from the spatialDataExtra repository. Writes the file communities_2050.tif in the working directory and returns it as a SpatRaster object.

Usage

communities_extra_2050()

Value

SpatRaster object.

See Also

Other communities: communities, communities_extra_2010(), communities_extra_2100(), communities_predictors, communities_responses


Download Environmental Raster for communities - 2100

Description

Downloads the future climate (2100) raster associated with the communities dataset from the spatialDataExtra repository. Writes the file communities_2100.tif in the working directory and returns it as a SpatRaster object.

Usage

communities_extra_2100()

Value

SpatRaster object.

See Also

Other communities: communities, communities_extra_2010(), communities_extra_2050(), communities_predictors, communities_responses


Predictor variable names for the dataset communities

Description

Character vector of 9 predictor variable names from communities.

Usage

data(communities_predictors)

Format

A character vector of length 9.

See Also

Other communities: communities, communities_extra_2010(), communities_extra_2050(), communities_extra_2100(), communities_responses

Examples

data(communities_predictors)
communities_predictors

Response variable names for the dataset communities

Description

Character vector of 6 response variable names from communities.

Usage

data(communities_responses)

Format

A character vector of length 6.

See Also

Other communities: communities, communities_extra_2010(), communities_extra_2050(), communities_extra_2100(), communities_predictors

Examples

data(communities_responses)
communities_responses

Butterfly and host plant presence in Sierra Nevada (SE Spain)

Description

sf data frame with co-occurrence records for a butterfly and its host plant in Sierra Nevada (SE Spain). Contains 3 response variables (see interaction_responses) and 10 numeric predictors at 100 m resolution (see interaction_predictors). Use interaction_extra() to download the associated environmental raster.

Usage

data(interaction)

Format

An sf data frame with 1,000 rows (presence and background points) and 14 columns:

Response variables (3):

Predictor variables:

Geometry:

Source

Species occurrences:

Remote sensing data:

Climate and topographic variables:

See Also

Other interaction: interaction_extra(), interaction_predictors, interaction_responses

Examples

data(interaction)
colnames(interaction)
nrow(interaction)
ncol(interaction)

Download Environmental Raster for interaction

Description

Downloads and reads the environmental raster associated with the interaction dataset from the spatialDataExtra repository. Requires the terra package. Writes the file sierra_nevada_env.tif to the working directory and returns a SpatRaster object.

Usage

interaction_extra()

Value

SpatRaster object.

See Also

Other interaction: interaction, interaction_predictors, interaction_responses


Predictor variable names for the dataset interaction

Description

Character vector of 10 predictor variable names from interaction.

Usage

data(interaction_predictors)

Format

A character vector of length 10.

See Also

Other interaction: interaction, interaction_extra(), interaction_responses

Examples

data(interaction_predictors)
interaction_predictors

Response variable names for the dataset interaction

Description

Character vector of 3 response variable names from interaction.

Usage

data(interaction_responses)

Format

A character vector of length 3.

See Also

Other interaction: interaction, interaction_extra(), interaction_predictors

Examples

data(interaction_responses)
interaction_responses

Presence of Linaria nigricans and greenhouses in Eastern Andalusia

Description

sf data frame with POINT geometry containing presence records of the plant Linaria nigricans, greenhouses, and background points from Eastern Andalusia (Spain). The data frame contains 2 response variables (see linaria_responses) and 20 numeric predictors (see linaria_predictors). Use linaria_extra() to download the associated environmental raster.

The dataset combines species presence records, greenhouse presence records (representing a competing land use), and randomly sampled background points. Species presences and greenhouse presences were spatially thinned at 400 m to remove redundancy at the raster resolution. Background points were randomly sampled within the extent of the presence records. Environmental predictors were extracted from a Landsat/DEM-derived raster at 400 m resolution (EPSG:25830).
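The thinning procedure itself is not detailed here. One common way to thin points to a raster resolution, shown below purely as an illustrative assumption, is to keep a single point per 400 m grid cell. The coordinates are synthetic and the approach is not necessarily the one used to build linaria.

```r
# Synthetic projected coordinates in meters (hypothetical points).
pts <- data.frame(
  x = c(100, 350, 450, 900, 950),
  y = c(100, 200, 100, 900, 950)
)

res <- 400  # thinning distance, matching the raster resolution

# Label each point with the index of the 400 m cell it falls in.
cell <- paste(floor(pts$x / res), floor(pts$y / res), sep = "_")

# Keep the first point found in each cell and drop the rest.
thinned <- pts[!duplicated(cell), ]
thinned
```

Points 1 and 2 share a cell, as do points 4 and 5, so three points remain after thinning.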

Usage

data(linaria)

Format

An sf data frame with 7,386 rows (presences and background points) and 25 columns:

Response variables:

Predictor variables:

Geometry:

Source

Published studies:

Landsat imagery:

Climate variables:

Topography:

See Also

Other linaria: linaria_extra(), linaria_predictors, linaria_responses

Examples

data(linaria)
colnames(linaria)
nrow(linaria)
ncol(linaria)

Download Environmental Raster for linaria

Description

Downloads and reads the 20-band environmental raster associated with the linaria dataset from the spatialDataExtra repository. Writes the file linaria_env.tif in the working directory and returns it as a SpatRaster object.

Usage

linaria_extra()

Value

SpatRaster object with 20 layers.

See Also

Other linaria: linaria, linaria_predictors, linaria_responses


Predictor variable names for the dataset linaria

Description

Character vector of 20 predictor variable names from linaria, covering Landsat reflectance (7), rainfall (2), solar radiation (2), temperature (4), and topography (5).

Usage

data(linaria_predictors)

Format

A character vector of length 20.

See Also

Other linaria: linaria, linaria_extra(), linaria_responses

Examples

data(linaria_predictors)
linaria_predictors

Response variable names for the dataset linaria

Description

Character vector of length 2 containing the names of the response variables in linaria.

Usage

data(linaria_responses)

Format

A character vector of length 2.

See Also

Other linaria: linaria, linaria_extra(), linaria_predictors

Examples

data(linaria_responses)
linaria_responses

Neanderthal presence in the Last Interglacial

Description

sf data frame with POINT geometry containing 227 records of Neanderthal presence from Marine Isotope Stage 5e (Last Interglacial) in Europe and the Near East, 1 response variable (see neanderthal_response), and 25 predictors (see neanderthal_predictors). Use neanderthal_extra() to download the associated environmental raster.

Usage

data(neanderthal)

Format

An sf data frame with 227 rows (presence and pseudo-absence sites) and 27 columns:

Response variable (1):

Predictor variables:

Bioclimatic variables derived from a Last Interglacial GCM simulation (Otto-Bliesner et al. 2006), downscaled following the method of Hijmans et al. (2005). These are analogous to the standard WorldClim bioclimatic variables but represent Last Interglacial (MIS 5e) conditions rather than modern climate:

Geometry:

Source

Presence data:

Palaeoclimatic variables (GCM simulation):

Palaeoclimatic variables (interpolation):

Elevation and topography:

See Also

Other neanderthal: neanderthal_extra(), neanderthal_predictors, neanderthal_response

Examples

data(neanderthal)
colnames(neanderthal)
nrow(neanderthal)
ncol(neanderthal)

Download Environmental Raster for neanderthal

Description

Downloads and reads the environmental raster associated with the neanderthal dataset from the spatialDataExtra repository. Writes the file neanderthal_env.tif to the working directory and returns it as a SpatRaster object.

Usage

neanderthal_extra()

Value

SpatRaster object.

See Also

Other neanderthal: neanderthal, neanderthal_predictors, neanderthal_response


Predictor variable names for the dataset neanderthal

Description

Character vector of 25 predictor variable names from neanderthal.

Usage

data(neanderthal_predictors)

Format

A character vector of length 25.

See Also

Other neanderthal: neanderthal, neanderthal_extra(), neanderthal_response

Examples

data(neanderthal_predictors)
neanderthal_predictors

Response variable name for the dataset neanderthal

Description

Character string with the name of the response variable in neanderthal.

Usage

data(neanderthal_response)

Format

A character string of length 1.

See Also

Other neanderthal: neanderthal, neanderthal_extra(), neanderthal_predictors

Examples

data(neanderthal_response)
neanderthal_response

Plant diversity metrics for the World's Ecoregions

Description

Plant diversity metrics (richness, rarity, beta diversity) obtained from GBIF Plantae records for the World's Ecoregions. Includes metrics for all plants, trees, and grasses at species, genus, and family taxonomic levels. Ecoregion boundaries are derived from Ecoregions 2017. Original polygon geometries have been converted to point centroids to reduce file size while preserving spatial context. Use plantae_extra() to download a full version with original polygon geometries. The datasets plantae_west and plantae_east are subsets of plantae focused on overall plant richness of the western and eastern hemispheres, respectively.

The GBIF download comprised 244,830,168 records from 4,741 datasets, filtered to records with coordinates, no geospatial issues, and occurrence status "present".

Tree species were identified by cross-referencing GBIF records with the BGCI Global Tree Search database (BGCI 2020). Grasses were defined as members of family Poaceae.

The rarity of each taxon was computed as the inverse of its number of spatial presence records in GBIF. Rarity-weighted richness is the sum of these rarity scores across the taxa in an ecoregion, while mean rarity is their mean across the taxa in an ecoregion.
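A minimal sketch of the two rarity metrics on a toy table of presence records. Taxon and ecoregion labels are hypothetical, and the sketch assumes that record counts are taken over the whole set of records and that each taxon contributes once per ecoregion; the original processing may differ in these details.

```r
# Toy presence records: one row per record (hypothetical labels).
records <- data.frame(
  taxon = c("A", "A", "A", "A", "B", "B", "C"),
  ecoregion = c("e1", "e1", "e2", "e2", "e1", "e2", "e2")
)

# Rarity of each taxon: inverse of its total number of records.
# A has 4 records (0.25), B has 2 (0.5), C has 1 (1).
rarity <- 1 / table(records$taxon)

# One row per taxon present in each ecoregion.
present <- unique(records)
present$rarity <- as.numeric(rarity[present$taxon])

# Rarity-weighted richness: sum of rarity scores per ecoregion.
rarity_weighted_richness <- tapply(present$rarity, present$ecoregion, sum)

# Mean rarity: mean of rarity scores per ecoregion.
mean_rarity <- tapply(present$rarity, present$ecoregion, mean)

rarity_weighted_richness
mean_rarity
```

In this toy case ecoregion e2 scores higher on both metrics because it holds taxon C, which has a single record.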

Beta diversity was computed between each ecoregion and its immediate neighboring ecoregions via Sorensen dissimilarity (Bsor = 1 - 2a/(2a+b+c)) and Simpson dissimilarity (Bsim = min(b,c)/(min(b,c)+a)), following Koleff et al. (2003).
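The two formulas can be written directly as R functions. The sketch below assumes the usual notation of Koleff et al. (2003), where a is the number of taxa shared by both ecoregions and b and c are the numbers of taxa found only in one or the other. The function names and species lists are hypothetical.

```r
# Sorensen dissimilarity: Bsor = 1 - 2a / (2a + b + c)
beta_sorensen <- function(a, b, c) {
  1 - (2 * a) / (2 * a + b + c)
}

# Simpson dissimilarity: Bsim = min(b, c) / (min(b, c) + a)
beta_simpson <- function(a, b, c) {
  pmin(b, c) / (pmin(b, c) + a)
}

# Hypothetical taxa lists for a focal ecoregion and one neighbor.
focal <- c("sp1", "sp2", "sp3", "sp4")
neighbor <- c("sp3", "sp4", "sp5")

n_shared <- length(intersect(focal, neighbor))        # a = 2
n_only_focal <- length(setdiff(focal, neighbor))      # b = 2
n_only_neighbor <- length(setdiff(neighbor, focal))   # c = 1

bsor <- beta_sorensen(n_shared, n_only_focal, n_only_neighbor)
bsim <- beta_simpson(n_shared, n_only_focal, n_only_neighbor)
c(bsor = bsor, bsim = bsim)
```

Here Bsor is 3/7 and Bsim is 1/3. Simpson dissimilarity is the smaller of the two because it discounts the difference in richness between the two lists.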

Fragmentation metrics were computed with the R package landscapemetrics (Hesselbarth et al. 2019) at 5 km resolution in Lambert Azimuthal Equal-Area projection.

Climate hypervolume was computed using hypervolume::hypervolume_svm() from the climate predictors.

Aridity is computed as 1 minus the aridity index of Trabucco and Zomer (2019), so maximum aridity is coded as 1.

Environmental predictors were extracted as mean pixel values per ecoregion from rasters at 1 km resolution.

Usage

data(plantae)

Format

An sf data frame with 662 rows (ecoregions) and 143 columns:

Identifier columns:

Response variables - Richness (9):

Response variables - Rarity-weighted richness (6):

Response variables - Mean rarity (6):

Response variables - Beta diversity R (absolute richness difference) (16):

Response variables - Beta diversity Sorensen (8):

Response variables - Beta diversity Simpson (8):

Predictor variables:

Geometry:

Source

Associated publications:

See Also

Other plantae: plantae_east, plantae_extra(), plantae_predictors, plantae_responses, plantae_west

Examples

data(plantae)
colnames(plantae)
nrow(plantae)
ncol(plantae)

Eastern Hemisphere subset of plantae

Description

Subset of the plantae dataset filtered to non-American ecoregions (ecoregion_continent != "Americas") with richness_species (overall plant species richness) as the only response variable. All 84 predictor variables and identifier columns in plantae are retained.

Usage

data(plantae_east)

Format

An sf data frame with 434 rows and 91 columns.

See Also

Other plantae: plantae, plantae_extra(), plantae_predictors, plantae_responses, plantae_west

Examples

data(plantae_east)
colnames(plantae_east)
nrow(plantae_east)
ncol(plantae_east)

Download Extended plantae Dataset

Description

Downloads and reads the extended version of the plantae dataset with original polygon geometries instead of point centroids, from the spatialDataExtra repository. Writes the file plantae.gpkg to the working directory and returns it as an sf data frame. See plantae for details on the response variables, predictors, and data sources.

Usage

plantae_extra()

Value

sf data frame with 662 rows and 143 columns (MULTIPOLYGON geometry, WGS84).

See Also

Other plantae: plantae, plantae_east, plantae_predictors, plantae_responses, plantae_west


Predictor variable names for the dataset plantae

Description

Character vector of 84 predictor variable names from plantae.

Usage

data(plantae_predictors)

Format

A character vector of length 84.

See Also

Other plantae: plantae, plantae_east, plantae_extra(), plantae_responses, plantae_west

Examples

data(plantae_predictors)
plantae_predictors

Response variable names for the dataset plantae

Description

Character vector containing the names of the 53 response variables in plantae.

Usage

data(plantae_responses)

Format

A character vector of length 53.

See Also

Other plantae: plantae, plantae_east, plantae_extra(), plantae_predictors, plantae_west

Examples

data(plantae_responses)
plantae_responses

Western Hemisphere subset of plantae

Description

Subset of the plantae dataset filtered to American ecoregions (ecoregion_continent == "Americas") with richness_species (overall plant species richness) as the only response variable. All 84 predictor variables and identifier columns in plantae are retained.

Usage

data(plantae_west)

Format

An sf data frame with 228 rows and 91 columns.

See Also

Other plantae: plantae, plantae_east, plantae_extra(), plantae_predictors, plantae_responses

Examples

data(plantae_west)
colnames(plantae_west)
nrow(plantae_west)
ncol(plantae_west)

European Quercus (Oak) Species Distribution with Environmental Predictors

Description

sf data frame with POINT geometry containing 6,728 records of eight European Quercus (oak) species and absence points, 1 response variable (see quercus_response), and 31 numeric predictors (see quercus_predictors). Use quercus_extra() to download the associated environmental raster.

Usage

data(quercus)

Format

An sf data frame with 6,728 rows (species occurrences and absences) and 33 columns:

Response variable:

Predictor variables:

WorldClim v2 bioclimatic variables (excludes bio8 and bio9):

Geometry:

Source

Species occurrences:

Bioclimatic variables and solar radiation:

NDVI:

Land cover:

Elevation and topography:

Human footprint:

See Also

Other quercus: quercus_extra(), quercus_predictors, quercus_response

Examples

data(quercus)
colnames(quercus)
nrow(quercus)
ncol(quercus)

Download Environmental Raster for quercus

Description

Downloads and reads the environmental raster associated with the quercus dataset from the spatialDataExtra repository. Writes the file quercus_env.tif to the working directory and returns it as a SpatRaster object.

Usage

quercus_extra()

Value

SpatRaster object.

See Also

Other quercus: quercus, quercus_predictors, quercus_response


Predictor variable names for the dataset quercus

Description

Character vector of 31 predictor variable names from quercus.

Usage

data(quercus_predictors)

Format

A character vector of length 31.

See Also

Other quercus: quercus, quercus_extra(), quercus_response

Examples

data(quercus_predictors)
quercus_predictors

Response variable name for the dataset quercus

Description

Character string with the name of the response variable in quercus.

Usage

data(quercus_response)

Format

A character string of length 1.

See Also

Other quercus: quercus, quercus_extra(), quercus_predictors

Examples

data(quercus_response)
quercus_response

Mesoamerican tree species richness

Description

sf data frame with POLYGON geometry representing 3,373 hexagonal grid cells across the Americas, with 1 response variable encoding tree species richness and 50 numeric environmental predictors.

Tree species richness in this dataset does NOT represent total tree species counts. The dataset focuses on the tree species found in Mesoamerica according to the Tree Biodiversity Network (BIOTREE-NET; Cayuela et al. 2012). These species were later used as input for a search query at the Global Biodiversity Information Facility (GBIF). The resulting presence data and environmental data at 1 km resolution were aggregated to a hexagonal grid.

The hexagonal grid was constructed using sf::st_make_grid(..., cellsize = 1, square = FALSE) at 1-degree resolution (WGS84, EPSG:4326), covering longitudes -125.3° to -34.3° and latitudes -34.4° to 49.9°.
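A minimal sketch of that grid construction over the stated extent. The object names and the cell_id column are hypothetical, and the sketch does not attempt to reproduce any later selection of cells or the aggregation of presence and environmental data.

```r
library(sf)

# Bounding box with the longitude and latitude limits given above (WGS84).
grid_extent <- sf::st_bbox(
  c(xmin = -125.3, ymin = -34.4, xmax = -34.3, ymax = 49.9),
  crs = sf::st_crs(4326)
)

# Hexagonal grid with 1-degree cells; square = FALSE requests hexagons.
hex <- sf::st_make_grid(
  sf::st_as_sfc(grid_extent),
  cellsize = 1,
  square = FALSE
)

# Wrap the geometries in an sf data frame with a simple cell identifier.
hex <- sf::st_sf(cell_id = seq_along(hex), geometry = hex)

nrow(hex)
```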

Usage

data(trees)

Format

An sf data frame with 3,373 rows (hexagonal cells) and 53 columns:

Identifier (1):

Response variable (1):

Predictor variables:

Geometry:

Source

Dataset publication:

Response variable (tree species richness):

Climate predictors (temperature, precipitation, air humidity, cloud cover, evapotranspiration):

Aridity:

Soil properties:

Soil temperature:

NDVI:

Elevation and geography:

See Also

Other trees: trees_extra(), trees_predictors, trees_response

Examples

data(trees)
colnames(trees)
nrow(trees)
ncol(trees)

Download Presence Records for trees

Description

Downloads and reads the tree species presence records associated with the trees dataset from the spatialDataExtra repository. Writes the file trees_presence.gpkg to the working directory and returns it as an sf data frame.

Usage

trees_extra()

Value

sf data frame with POINT geometry (WGS84, EPSG:4326) and columns species and source.

See Also

Other trees: trees, trees_predictors, trees_response


Predictor variable names for the dataset trees

Description

Character vector of 50 predictor variable names from trees.

Usage

data(trees_predictors)

Format

A character vector of length 50.

See Also

Other trees: trees, trees_extra(), trees_response

Examples

data(trees_predictors)
trees_predictors

Response variable name for the dataset trees

Description

Character vector of length 1 containing the name of the response variable in trees.

Usage

data(trees_response)

Format

A character vector of length 1.

See Also

Other trees: trees, trees_extra(), trees_predictors

Examples

data(trees_response)
trees_response

Global long-term NDVI records and environmental predictors

Description

sf data frame with POINT geometry representing 9,265 global locations, with 5 response variables (see vi_responses) that are different encodings of the long-term average (1999-2019) of the Normalized Difference Vegetation Index (NDVI), and 58 environmental predictors (47 numeric, 11 categorical; see vi_predictors). Use vi_extra() to download an extended version with 30,000 rows. A smaller version of this dataset (580 rows) is available as vi_smol.

NDVI values are derived from the Copernicus Global Land Service Long Term Statistics product (1999-2019) at 1 km resolution. Locations were spatially thinned to reduce spatial autocorrelation.

Environmental predictors were extracted as pixel values from normalized raster data at 1 km resolution.

Usage

data(vi)

Format

An sf data frame with 9,265 rows (locations) and 64 columns:

Response variables (5):

Predictor variables:

Geometry:

Source

Response variables (NDVI):

Climate classification:

Soil water index:

Climate predictors (temperature, rainfall, solar radiation, growing season, evapotranspiration, cloud cover, humidity):

Soil type and properties:

Soil temperature:

Ecoregions and biogeography:

Elevation and topography:

Aridity index:

Country, continent, region, and subregion:

See Also

Other vi: vi_extra(), vi_predictors, vi_responses, vi_smol

Examples

data(vi)
colnames(vi)
nrow(vi)
ncol(vi)

Download extended vi dataset

Description

Downloads and reads the extended version of the vi dataset (30,000 rows) from the spatialDataExtra repository. Writes the file vi.gpkg to the working directory and returns it as an sf data frame. See vi for details on the response variables, predictors, and data sources.

Usage

vi_extra()

Value

sf data frame with 30,000 rows and 64 columns (POINT geometry, WGS84).

See Also

Other vi: vi, vi_predictors, vi_responses, vi_smol


Predictor variable names for the dataset vi

Description

Character vector of 58 predictor variable names from vi.

Usage

data(vi_predictors)

Format

A character vector of length 58.

See Also

Other vi: vi, vi_extra(), vi_responses, vi_smol

Examples

data(vi_predictors)
vi_predictors

Response variable names for the dataset vi

Description

Character vector containing the names of the 5 response variables in vi.

Usage

data(vi_responses)

Format

A character vector of length 5.

See Also

Other vi: vi, vi_extra(), vi_predictors, vi_smol

Examples

data(vi_responses)
vi_responses

Small version of vi

Description

Same as dataset vi, but with only 580 rows.

Usage

data(vi_smol)

Format

A data frame with 580 rows and 65 columns.

See Also

vi

Other vi: vi, vi_extra(), vi_predictors, vi_responses

Examples

data(vi_smol)
colnames(vi_smol)
nrow(vi_smol)
ncol(vi_smol)