Skip to contents

Model principles

Let us begin by defining two spatial cross-sectional series \(\{x_s\}_{s \in S}\) and \(\{y_s\}_{s \in S}\), where \(S\) represents the study area.

We use:

\[ \mathcal{X}_W = \{ W_i x \mid W_i \in \mathcal{W}(x, y) \} \]

\[ \mathcal{Y}_W = \{ W_i y \mid W_i \in \mathcal{W}(x, y) \} \]

to denote the sets of spatial lags of \(x\) and \(y\) given by all the weighting matrices in \(\mathcal{W}(x, y)\)(that is the set of spatial dependence structures between \(x\) and \(y\)).

We say that \(\{x_s\}_{s \in S}\) does not cause \(\{y_s\}_{s \in S}\) under the spatial structures \(\mathcal{X}_W\) and \(\mathcal{Y}_W\) if

\[ h_{y |\mathcal{Y}_W}(m) = h_{y | \mathcal{Y}_W,\mathcal{X}_W}(m) \] A unilateral non-parametric test can be applied to assess the spatial causality via the following null hypothesis:

\[ H_0 : \{x_s\}_{s \in S} \text{ does not cause } \{y_s\}_{s \in S} \text{ under the spatial structures } \mathcal{X}_W \text{ and } \mathcal{Y}_W, \]

with the following statistic:

\[ \hat{\delta}(\mathcal{Y}_W, \mathcal{X}_W) = \hat{h}_{y |\mathcal{Y}_W}(m) - \hat{h}_{y | \mathcal{Y}_W,\mathcal{X}_W}(m) \]

where \(\hat{h}_*(m)\) is the estimated conditional symbolic entropy using Shannon’s entropy with \(m-1\) nearest neighbors. The alternative is that the null hypothesis of is not true.

If \(\mathcal{X}_W\) does not contain extra information about \(y\) then \(\hat{\delta}(\mathcal{Y}_W, \mathcal{X}_W) = 0\), otherwise, $ (_W, _W) > 0$.

\(h_{y |\mathcal{Y}_W}(m)\) measures the uncertainty of the distribution of symbols of \(y\), conditional to the symbols of its spatial lag, $ _W$. Moreover, \(h_{y | \mathcal{Y}_W,\mathcal{X}_W}(m)\) measures the uncertainty of the distribution of symbols of \(y\), conditional to the symbols of the spatial lags of \(y\), \(\mathcal{Y}_W\), and of \(x\), $ _W$. If the second variable, \(x\), indeed causes the first one then there should be a significant decrease in the entropy, and the statistic \(\hat{\delta}(\mathcal{Y}_W, \mathcal{X}_W)\) will take on high positive values. If there is only a spatial correlation, but not causation, the difference between both entropies will be small. The statistical significance of \(\hat{\delta}(\mathcal{Y}_W, \mathcal{X}_W)\) is assessed using spatial block bootstrap.

The Spatial (Granger) Causality Test (SCT) is not well-suited for spatial field data; instead, it is more appropriate for spatial areal data, where observations are aggregated over distinct spatial units. As illustrated in the following example, spatial areal data yielded a reasonable spatial causality statistic, whereas raster-based data produced a negative value—indicating no causality—despite the fact that a causal association should exist.

Usage examples

An example of spatial lattice data

Load the spEDM package and the columbus OH dataset:

library(spEDM)

columbus = sf::read_sf(system.file("case/columbus.gpkg", package="spEDM"))
columbus
## Simple feature collection with 49 features and 6 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 5.874907 ymin: 10.78863 xmax: 11.28742 ymax: 14.74245
## Projected CRS: Undefined Cartesian SRS with unknown unit
## # A tibble: 49 × 7
##    hoval   inc  crime  open plumb discbd                                    geom
##    <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl>                               <POLYGON>
##  1  80.5 19.5  15.7   2.85  0.217   5.03 ((8.624129 14.23698, 8.5597 14.74245, …
##  2  44.6 21.2  18.8   5.30  0.321   4.27 ((8.25279 14.23694, 8.282758 14.22994,…
##  3  26.4 16.0  30.6   4.53  0.374   3.89 ((8.653305 14.00809, 8.81814 14.00205,…
##  4  33.2  4.48 32.4   0.394 1.19    3.7  ((8.459499 13.82035, 8.473408 13.83227…
##  5  23.2 11.3  50.7   0.406 0.625   2.83 ((8.685274 13.63952, 8.677577 13.72221…
##  6  28.8 16.0  26.1   0.563 0.254   3.78 ((9.401384 13.5504, 9.434411 13.69427,…
##  7  75    8.44  0.178 0     2.40    2.74 ((8.037741 13.60752, 8.062716 13.60452…
##  8  37.1 11.3  38.4   3.48  2.74    2.89 ((8.247527 13.58651, 8.2795 13.5965, 8…
##  9  52.6 17.6  30.5   0.527 0.891   3.17 ((9.333297 13.27242, 9.671007 13.27361…
## 10  96.4 13.6  34.0   1.55  0.558   4.33 ((10.08251 13.03377, 10.0925 13.05275,…
## # ℹ 39 more rows

Use SCT to detect spatial causality among the variables inc, crime, and hoval :

# house value and crime
sc.test(columbus, "hoval", "crime", k = 15)
## spatial (granger) causality test
## hoval -> crime: statistic = 1.101, p value = 0.526
## crime -> hoval: statistic = 1.484, p value = 0.023

# household income and crime
sc.test(columbus, "inc", "crime", k = 15)
## spatial (granger) causality test
## inc -> crime: statistic = 1.116, p value = 0.449
## crime -> inc: statistic = 0.848, p value = 0.845

# household income and house value
sc.test(columbus, "inc", "hoval", k = 15)
## spatial (granger) causality test
## inc -> hoval: statistic = 1.517, p value = 0.035
## hoval -> inc: statistic = 0.851, p value = 0.922

An example of spatial grid data

Load the spEDM package and its farmland NPP data:

library(spEDM)

npp = terra::rast(system.file("case/npp.tif", package = "spEDM"))
# To save the computation time, we will aggregate the data by 3 times
npp = terra::aggregate(npp, fact = 3, na.rm = TRUE)
npp
## class       : SpatRaster 
## dimensions  : 135, 161, 4  (nrow, ncol, nlyr)
## resolution  : 30000, 30000  (x, y)
## extent      : -2625763, 2204237, 1867078, 5917078  (xmin, xmax, ymin, ymax)
## coord. ref. : CGCS2000_Albers 
## source(s)   : memory
## names       :      npp,        pre,      tem,      elev 
## min values  :   187.50,   390.3351, -47.8194, -110.1494 
## max values  : 15381.89, 23734.5330, 262.8576, 5217.6431

Use SCT to detect spatial causality among the variables pre, tem, and npp :

# precipitation and npp
sc.test(npp,"pre","npp",k = 270)
## spatial (granger) causality test
## pre -> npp: statistic = -3.625, p value = 0
## npp -> pre: statistic = -3.226, p value = 0

# temperature and npp
sc.test(npp,"tem","npp",k = 270)
## spatial (granger) causality test
## tem -> npp: statistic = -3.625, p value = 0
## npp -> tem: statistic = -3.752, p value = 0

# precipitation and temperature
sc.test(npp,"pre","tem",k = 270)
## spatial (granger) causality test
## pre -> tem: statistic = -3.753, p value = 0
## tem -> pre: statistic = -3.227, p value = 0