Type: Package
Title: Spatial Analysis with Misaligned Data Using Atom-Based Regression Models
Version: 0.2.5
Date: 2026-02-08
Description: Implements atom-based regression models (ABRM) for analyzing spatially misaligned data. Provides functions for simulating misaligned spatial data, preparing NIMBLE model inputs, running MCMC diagnostics, and comparing different spatial analysis methods including dasymetric mapping. All main functions return S3 objects with print(), summary(), and plot() methods for intuitive result exploration. Methods are described in Nethery et al. (2023) <doi:10.1101/2023.01.10.23284410>. Further methodological details and software implementation are described in Qian et al. (in review).
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (>= 4.0.0), nimble
Imports: sp, sf, spdep, MASS, raster, dplyr, tidyr, ggplot2, reshape2, coda, BiasedUrn, stats, utils, grDevices, methods
Suggests: testthat (>= 3.0.0), knitr, rmarkdown
VignetteBuilder: knitr
URL: https://github.com/bellayqian/spatialAtomizeR
BugReports: https://github.com/bellayqian/spatialAtomizeR/issues
RoxygenNote: 7.3.3
NeedsCompilation: no
Packaged: 2026-02-09 03:27:58 UTC; anemos
Author: Yunzhe Qian [aut, cre], Rachel Nethery [aut], Nancy Krieger [ctb] (Contributed to the project conceptualization and manuscript), Nykesha Johnson [ctb] (Contributed to the project conceptualization and manuscript)
Maintainer: Yunzhe Qian <qyzanemos@gmail.com>
Repository: CRAN
Date/Publication: 2026-02-09 03:50:02 UTC

spatialAtomizeR: Spatial Analysis with Misaligned Data Using Atom-Based Regression Models

Description

Implements atom-based regression models (ABRM) for analyzing spatially misaligned data. Provides functions for simulating misaligned spatial data, preparing NIMBLE model inputs, running MCMC diagnostics, and comparing different spatial analysis methods including dasymetric mapping. All main functions return S3 objects with print(), summary(), and plot() methods for intuitive result exploration. Methods are described in Nethery et al. (2023) doi:10.1101/2023.01.10.23284410. Further methodological details and software implementation are described in Qian et al. (in review).

Implements atom-based Bayesian regression methods (ABRM) for spatial data with misaligned grids.

Main Functions

simulate_misaligned_data

Generate simulated spatial data

get_abrm_model

Get NIMBLE model code for ABRM

run_abrm

Run atom-based Bayesian regression model

Author(s)

Maintainer: Yunzhe Qian qyzanemos@gmail.com

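Authors:

Rachel Nethery

Other contributors:

Nancy Krieger (Contributed to the project conceptualization and manuscript) [contributor]

Nykesha Johnson (Contributed to the project conceptualization and manuscript) [contributor]

See Also

Useful links:

https://github.com/bellayqian/spatialAtomizeR

Report bugs at https://github.com/bellayqian/spatialAtomizeR/issues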


Nimble R Call Wrapper for BiasedUrn

Description

Internal wrapper that calls an R function from compiled NIMBLE code.

Usage

Rmfnchypg(total, odds, ni)

Arguments

total

Total number of items

odds

Vector of odds

ni

Vector of category sizes

Value

Vector of sampled counts


R Wrapper Function for BiasedUrn Sampling

Description

Wraps the BiasedUrn::rMFNCHypergeo function for use in NIMBLE models

Usage

biasedUrn_rmfnc(total, odds, ni)

Arguments

total

Integer, total number of items to sample

odds

Numeric vector of odds for each category

ni

Integer vector of population sizes

Value

Numeric vector of sampled counts
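
Examples

The sketch below calls the wrapped BiasedUrn sampler directly to show what a single draw looks like. It assumes that this wrapper's total and ni correspond to BiasedUrn's n and m arguments, which the argument descriptions suggest but do not state; the numeric values are illustrative only.

```r
# Sketch only: calls BiasedUrn directly. The mapping of this wrapper's
# arguments onto BiasedUrn's (total -> n, ni -> m) is an assumption.
if (requireNamespace("BiasedUrn", quietly = TRUE)) {
  ni    <- c(20, 30, 50)   # items available in each category
  odds  <- c(1, 2, 0.5)    # relative odds of drawing from each category
  total <- 10              # number of items to draw

  set.seed(1)
  draw <- BiasedUrn::rMFNCHypergeo(nran = 1, m = ni, n = total, odds = odds)

  # The sampled counts sum to `total` and never exceed the category sizes
  stopifnot(sum(draw) == total, all(draw <= ni))
  print(as.vector(draw))
}
```

Each draw allocates the total across the categories, so the returned vector has one count per category.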


Dasymetric Mapping

Description

Maps X-grid covariates to Y-grid using centroid-based spatial join

Usage

dasymetric_mapping(misaligned_data)

Arguments

misaligned_data

List with gridx and gridy from simulate_misaligned_data

Value

sf object with Y grid containing mapped X covariates


Density function for multivariate non-central hypergeometric

Description

Density function for the multivariate non-central hypergeometric distribution, for use in NIMBLE models.

Usage

dmfnchypg(x, total, odds, ni, log = 0)

Arguments

x

Vector of counts

total

Total number of items

odds

Vector of odds

ni

Vector of category sizes

log

Flag given as 0 or 1; if 1, the log probability is returned (default = 0)

Value

The log-probability (if log=1) or probability (if log=0)
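
Examples

As a rough illustration of the quantity this density returns, the sketch below evaluates the same distribution through BiasedUrn::dMFNCHypergeo. Whether dmfnchypg delegates to that function, and the mapping total -> n and ni -> m, are assumptions; the counts and odds are made up for the example.

```r
# Sketch only: evaluates the density with BiasedUrn rather than dmfnchypg().
if (requireNamespace("BiasedUrn", quietly = TRUE)) {
  ni    <- c(20, 30, 50)   # category sizes
  odds  <- c(1, 2, 0.5)    # odds for each category
  total <- 10              # total number of items drawn
  x     <- c(2, 6, 2)      # candidate allocation; must sum to `total`

  p     <- BiasedUrn::dMFNCHypergeo(x = x, m = ni, n = total, odds = odds)
  log_p <- log(p)          # what log = 1 would correspond to

  stopifnot(sum(x) == total, p > 0, p < 1)
  print(c(prob = p, log_prob = log_p))
}
```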


Fit Dasymetric Model

Description

Fits regression model to dasymetrically mapped data

Usage

fit_dasymetric_model(mapped_data, outcome_type)

Arguments

mapped_data

Output from dasymetric_mapping

outcome_type

Distribution type: 'normal', 'poisson', or 'binomial'

Value

Data frame with parameter estimates and confidence intervals
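
Examples

A minimal sketch of the dasymetric workflow, chaining the documented functions in the order their descriptions imply. It assumes the package is installed and that these functions are exported; the seed is arbitrary, and the intercept and coefficient values are borrowed from the run_sensitivity_analysis defaults.

```r
# Sketch only: simulate data, map X-grid covariates to the Y grid, fit a model.
if (requireNamespace("spatialAtomizeR", quietly = TRUE)) {
  library(spatialAtomizeR)

  sim <- simulate_misaligned_data(
    seed = 42,
    dist_y = "poisson",
    x_intercepts = c(4, -1, -1),
    y_intercepts = c(4, -1, -1),
    beta0_y = -1,
    beta_x = c(-0.03, 0.1, -0.2),
    beta_y = c(0.03, -0.1, 0.2)
  )

  # Centroid-based join of X-grid covariates onto the Y grid
  mapped <- dasymetric_mapping(sim)

  # outcome_type should match the dist_y used in the simulation
  dasy_fit <- fit_dasymetric_model(mapped, outcome_type = "poisson")
  print(dasy_fit)
}
```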


Generate Correlated Spatial Effects

Description

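Generates spatial effects for several variables from a spatial adjacency matrix, with a specified spatial correlation parameter, spatial variance, and between-variable correlation.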

Usage

gen_correlated_spat(
  W,
  n_vars,
  rho = 0.6,
  var_spat = 1,
  correlation = 0.5,
  verify = FALSE
)

Arguments

W

Spatial adjacency matrix

n_vars

Number of variables

rho

Spatial correlation parameter (default = 0.6)

var_spat

Spatial variance (default = 1)

correlation

Correlation between variables (default = 0.5)

verify

Logical for verification (default = FALSE)

Value

Matrix of spatial effects
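
Examples

The sketch below builds a small rook-adjacency matrix in base R and passes it as W. The helper make_rook_adjacency is hypothetical (not part of the package), and it is an assumption that W is expected to be a symmetric binary matrix and that gen_correlated_spat is exported.

```r
# Hypothetical helper: rook adjacency for an nr x nc lattice,
# with cells numbered down the columns.
make_rook_adjacency <- function(nr, nc) {
  n <- nr * nc
  W <- matrix(0, n, n)
  cell <- function(i, j) (j - 1) * nr + i
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      if (i < nr) W[cell(i, j), cell(i + 1, j)] <- 1  # neighbour below
      if (j < nc) W[cell(i, j), cell(i, j + 1)] <- 1  # neighbour to the right
    }
  }
  W + t(W)  # make the matrix symmetric
}

W <- make_rook_adjacency(4, 4)
stopifnot(isSymmetric(W), all(diag(W) == 0))

# Sketch only: assumes the package is installed and the function is exported.
if (requireNamespace("spatialAtomizeR", quietly = TRUE)) {
  set.seed(42)
  effects <- spatialAtomizeR::gen_correlated_spat(
    W, n_vars = 2, rho = 0.6, var_spat = 1, correlation = 0.5
  )
  print(dim(effects))
}
```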


Get ABRM Model Code for NIMBLE

Description

Returns the NIMBLE code for the Atom-Based Regression Model with mixed-type variables. Automatically registers custom distributions if not already registered.

Usage

get_abrm_model()

Value

A nimbleCode object containing the model specification


Register Custom NIMBLE Distributions

Description

Registers the custom distributions for use in NIMBLE models.

Usage

register_nimble_distributions()

Value

Invisible TRUE


Random generation for multivariate non-central hypergeometric

Description

Random generation for the multivariate non-central hypergeometric distribution, for use in NIMBLE models.

Usage

rmfnchypg(n, total, odds, ni)

Arguments

n

Number of observations (only n = 1 is used)

total

Total number of items

odds

Vector of odds

ni

Vector of category sizes

Value

Vector of sampled counts


Run ABRM Analysis

Description

Runs the Atom-Based Regression Model on simulated data

Usage

run_abrm(
  gridx,
  gridy,
  atoms,
  model_code,
  true_params = NULL,
  norm_idx_x = NULL,
  pois_idx_x = NULL,
  binom_idx_x = NULL,
  norm_idx_y = NULL,
  pois_idx_y = NULL,
  binom_idx_y = NULL,
  dist_y = 2,
  niter = 50000,
  nburnin = 30000,
  nchains = 2,
  thin = 10,
  sim_metadata = NULL,
  save_plots = TRUE,
  output_dir = NULL
)

Arguments

gridx

The X-grid sf dataframe, containing a numeric area ID variable named 'ID' and covariates named 'covariate_x_1','covariate_x_2',...

gridy

The Y-grid sf dataframe, containing a numeric area ID variable named 'ID', covariates named 'covariate_y_1','covariate_y_2',..., and an outcome named 'y'.

atoms

The atom sf dataframe, which should contain numeric variables named 'ID_x' and 'ID_y' holding the X-grid and Y-grid cell IDs for each atom, as well as an atom-level population count named 'population'.

model_code

NIMBLE model code from get_abrm_model()

true_params

The true outcome model regression coefficient parameters, if known (e.g., from simulate_misaligned_data())

norm_idx_x

Vector of numeric indices of X-grid covariates (ordered as 'covariate_x_1','covariate_x_2',...) that should be treated as normally-distributed

pois_idx_x

Vector of numeric indices of X-grid covariates (ordered as 'covariate_x_1','covariate_x_2',...) that should be treated as Poisson-distributed

binom_idx_x

Vector of numeric indices of X-grid covariates (ordered as 'covariate_x_1','covariate_x_2',...) that should be treated as binomial-distributed

norm_idx_y

Vector of numeric indices of Y-grid covariates (ordered as 'covariate_y_1','covariate_y_2',...) that should be treated as normally-distributed

pois_idx_y

Vector of numeric indices of Y-grid covariates (ordered as 'covariate_y_1','covariate_y_2',...) that should be treated as Poisson-distributed

binom_idx_y

Vector of numeric indices of Y-grid covariates (ordered as 'covariate_y_1','covariate_y_2',...) that should be treated as binomial-distributed

dist_y

Distribution type for outcome (1=normal, 2=poisson, 3=binomial)

niter

Number of MCMC iterations (default: 50000)

nburnin

Number of burn-in iterations (default: 30000)

nchains

Number of MCMC chains (default: 2)

thin

Thinning interval (default: 10)

sim_metadata

Optional simulation metadata list

save_plots

Logical, whether to save diagnostic plots (default: TRUE)

output_dir

Directory for saving outputs (default: NULL)

Value

List containing MCMC results and parameter estimates
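
Examples

A hedged end-to-end sketch: simulate data, fetch the model code, and run the ABRM. It assumes the components returned by simulate_misaligned_data() can be passed to run_abrm() unchanged, and it uses deliberately short MCMC settings that are only suitable for a quick demonstration, not for inference. The covariate indices follow the order of the distribution vectors (1 = normal, 2 = poisson, 3 = binomial).

```r
# Sketch only: assumes spatialAtomizeR is installed.
if (requireNamespace("spatialAtomizeR", quietly = TRUE)) {
  library(spatialAtomizeR)  # nimble is attached too (listed under Depends)

  sim <- simulate_misaligned_data(
    seed = 42,
    dist_covariates_x = c("normal", "poisson", "binomial"),
    dist_covariates_y = c("normal", "poisson", "binomial"),
    dist_y = "poisson",
    x_intercepts = c(4, -1, -1),
    y_intercepts = c(4, -1, -1),
    beta0_y = -1,
    beta_x = c(-0.03, 0.1, -0.2),
    beta_y = c(0.03, -0.1, 0.2)
  )

  fit <- run_abrm(
    gridx = sim$gridx,
    gridy = sim$gridy,
    atoms = sim$atoms,
    model_code = get_abrm_model(),
    true_params = sim$true_params,
    norm_idx_x = 1, pois_idx_x = 2, binom_idx_x = 3,
    norm_idx_y = 1, pois_idx_y = 2, binom_idx_y = 3,
    dist_y = 2,                                # 2 = poisson, as simulated
    niter = 2000, nburnin = 1000, thin = 2,    # short demo run only
    nchains = 2,
    save_plots = FALSE                         # avoid writing files
  )

  print(fit)
}
```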


Run Both Methods and Compare

Description

Runs both ABRM and dasymetric mapping methods and compares results

Usage

run_both_methods(
  sim_data,
  sim_metadata,
  model_code,
  nimble_params,
  output_dir,
  norm_idx_x,
  pois_idx_x,
  binom_idx_x,
  norm_idx_y,
  pois_idx_y,
  binom_idx_y,
  dist_y,
  outcome_type
)

Arguments

sim_data

List of data elements to be used in the ABRM, structured like the output from the simulate_misaligned_data() function. The first element of this list is the Y-grid sf dataframe (named 'gridy'), containing a numeric area ID variable named 'ID_y', covariates named 'covariate_y_1','covariate_y_2',..., and an outcome named 'y'. The second element of this list is the X-grid sf dataframe (named 'gridx'), containing a numeric area ID variable named 'ID_x' and covariates named 'covariate_x_1','covariate_x_2',... The third element of the list is the atom sf dataframe (named 'atoms'), which should contain variables named 'ID_x' and 'ID_y' holding the X-grid and Y-grid cell IDs for each atom, as well as an atom-level population count named 'population'.

sim_metadata

Simulation metadata

model_code

NIMBLE model code

nimble_params

List of NIMBLE parameters (niter, nburnin, thin, nchains)

output_dir

Output directory

norm_idx_x

Indices of normal X covariates

pois_idx_x

Indices of Poisson X covariates

binom_idx_x

Indices of binomial X covariates

norm_idx_y

Indices of normal Y covariates

pois_idx_y

Indices of Poisson Y covariates

binom_idx_y

Indices of binomial Y covariates

dist_y

Distribution type for outcome (1=normal, 2=poisson, 3=binomial)

outcome_type

Outcome distribution name

Value

List with combined comparison, ABRM results, and dasymetric results


Run NIMBLE Model with Diagnostics

Description

Runs a NIMBLE model by MCMC and returns the samples, a summary, and convergence diagnostics.

Usage

run_nimble_model(
  constants,
  data,
  inits,
  sim_metadata = NULL,
  model_code,
  niter = 50000,
  nburnin = 30000,
  nchains = 2,
  thin = 10,
  save_plots = TRUE,
  output_dir = NULL
)

Arguments

constants

List of model constants

data

List of data

inits

List of initial values

sim_metadata

List with simulation metadata (optional)

model_code

NIMBLE code object

niter

Number of MCMC iterations (default: 50000)

nburnin

Number of burn-in iterations (default: 30000)

nchains

Number of MCMC chains (default: 2)

thin

Thinning interval (default: 10)

save_plots

Logical, whether to save diagnostic plots (default: TRUE)

output_dir

Directory for saving plots (default: NULL)

Value

List containing MCMC samples, summary, and convergence diagnostics
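
Examples

The default MCMC settings imply a specific number of retained draws, worked out below. This assumes the settings are forwarded to NIMBLE unchanged and follow NIMBLE's convention that nburnin is counted before thinning; the variable names kept_per_chain and kept_total are illustrative.

```r
# Retained posterior draws implied by the default settings
niter   <- 50000
nburnin <- 30000
thin    <- 10
nchains <- 2

kept_per_chain <- floor((niter - nburnin) / thin)  # 2000 draws per chain
kept_total     <- kept_per_chain * nchains         # 4000 draws in total

print(c(per_chain = kept_per_chain, total = kept_total))
```

Raising thin reduces storage and autocorrelation in the saved draws but leaves fewer samples for summaries, so niter may need to grow with it.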


Run Sensitivity Analysis

Description

Performs sensitivity analysis across different correlation structures

Usage

run_sensitivity_analysis(
  correlation_grid = c(0.2, 0.6),
  n_sims_per_setting = 3,
  base_params = list(
    dist_covariates_x = c("normal", "poisson", "binomial"),
    dist_covariates_y = c("normal", "poisson", "binomial"),
    dist_y = "poisson",
    x_intercepts = c(4, -1, -1),
    y_intercepts = c(4, -1, -1),
    beta0_y = -1,
    beta_x = c(-0.03, 0.1, -0.2),
    beta_y = c(0.03, -0.1, 0.2)
  ),
  mcmc_params = list(niter = 50000, nburnin = 30000, thin = 10, nchains = 2),
  model_code,
  base_seed = 123,
  output_dir = NULL
)

Arguments

correlation_grid

Vector of correlation values to test

n_sims_per_setting

Number of simulations per correlation setting

base_params

List of base simulation parameters

mcmc_params

List of MCMC parameters

model_code

NIMBLE model code

base_seed

Base random seed

output_dir

Output directory for results (default: NULL, uses tempdir())

Value

List with combined results, summary statistics, and output directory
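
Examples

A small sketch of a reduced sensitivity run. It assumes the package is installed, that one data set is simulated per setting and replicate (so the count below is length(correlation_grid) times n_sims_per_setting), and it shortens the MCMC settings purely to keep the demonstration fast.

```r
correlation_grid   <- c(0.2, 0.6)
n_sims_per_setting <- 1

# Assumed number of simulated data sets: one per setting and replicate
n_datasets <- length(correlation_grid) * n_sims_per_setting
print(n_datasets)

# Sketch only: assumes spatialAtomizeR is installed.
if (requireNamespace("spatialAtomizeR", quietly = TRUE)) {
  library(spatialAtomizeR)

  sens <- run_sensitivity_analysis(
    correlation_grid = correlation_grid,
    n_sims_per_setting = n_sims_per_setting,
    mcmc_params = list(niter = 2000, nburnin = 1000, thin = 2, nchains = 2),
    model_code = get_abrm_model(),
    base_seed = 123,
    output_dir = tempdir()
  )

  print(sens)
}
```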


Simulate Misaligned Spatial Data

Description

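Simulates spatially misaligned data: a Y grid with covariates and an outcome, an X grid with covariates, the atoms linking the two grids, and the true parameter values.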

Usage

simulate_misaligned_data(
  seed = 2,
  dist_covariates_x = c("normal", "poisson", "binomial"),
  dist_covariates_y = c("normal", "poisson", "binomial"),
  dist_y = "poisson",
  x_intercepts = rep(0, 3),
  y_intercepts = rep(0, 3),
  rho_x = 0.6,
  rho_y = 0.6,
  x_correlation = 0.5,
  y_correlation = 0.5,
  beta0_y = NULL,
  beta_x = NULL,
  beta_y = NULL,
  diff_pops = TRUE,
  xy_cov_cor = FALSE
)

Arguments

seed

Random seed (default = 2)

dist_covariates_x

Vector specifying distribution type for each synthetic X-grid covariate ('poisson', 'binomial', or 'normal')

dist_covariates_y

Vector specifying distribution type for each synthetic Y-grid covariate ('poisson', 'binomial', or 'normal')

dist_y

Distribution type for synthetic outcome variable (one of 'poisson', 'binomial', or 'normal')

x_intercepts

Intercepts for X covariates

y_intercepts

Intercepts for Y covariates

rho_x

Spatial correlation parameter for X-grid covariates (0 to 1 with higher values yielding more spatial correlation, default = 0.6)

rho_y

Spatial correlation parameter for Y-grid covariates and outcome (0 to 1 with higher values yielding more spatial correlation, default = 0.6)

x_correlation

Between-variable correlation for all pairs of X-grid covariates (default = 0.5)

y_correlation

Between-variable correlation for all pairs of Y-grid covariates (default = 0.5)

beta0_y

Intercept for outcome model

beta_x

Outcome model coefficients for X-grid covariates

beta_y

Outcome model coefficients for Y-grid covariates

diff_pops

Logical, indicating whether the atoms should be generated with different population sizes (diff_pops = TRUE) or a common population size (diff_pops = FALSE)

xy_cov_cor

Logical, indicating whether the atom-level spatial random effects for X-grid and Y-grid covariates should be correlated (xy_cov_cor = TRUE) or not. When set to TRUE, the x_correlation and rho_x parameters are used to generate all covariates (separate correlation parameters are not allowed for X-grid and Y-grid covariates).

Value

List containing gridy, gridx, atoms, and true_params
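
Examples

A minimal sketch of a simulation call followed by a look at the returned components. It assumes the package is installed; the seed is arbitrary and the intercept and coefficient values are taken from the run_sensitivity_analysis defaults rather than recommended settings.

```r
# Sketch only: assumes spatialAtomizeR is installed.
if (requireNamespace("spatialAtomizeR", quietly = TRUE)) {
  sim <- spatialAtomizeR::simulate_misaligned_data(
    seed = 42,
    dist_covariates_x = c("normal", "poisson", "binomial"),
    dist_covariates_y = c("normal", "poisson", "binomial"),
    dist_y = "poisson",
    x_intercepts = c(4, -1, -1),
    y_intercepts = c(4, -1, -1),
    rho_x = 0.6,
    rho_y = 0.6,
    beta0_y = -1,
    beta_x = c(-0.03, 0.1, -0.2),
    beta_y = c(0.03, -0.1, 0.2),
    diff_pops = TRUE,
    xy_cov_cor = FALSE
  )

  # Documented components: gridy, gridx, atoms, true_params
  print(names(sim))
  print(head(sim$atoms))
}
```

The same seed should reproduce the same data set, which is useful when comparing methods on identical inputs.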