Function for determining the optimal spatial data discretization based on SPADE q-statistics.
Usage
cpsd_disc(
  formula,
  data,
  wt,
  discnum = 3:8,
  discmethod = "quantile",
  strategy = 2L,
  increase_rate = 0.05,
  cores = 1,
  return_disc = TRUE,
  seed = 123456789,
  ...
)
Arguments
- formula
A formula specifying the response variable and the explanatory variables to discretize, e.g. y ~ xa + xb + xc.
- data
A data.frame or tibble of observation data.
- wt
The spatial weight matrix.
- discnum
(optional) A vector of candidate numbers of classes for discretization. Default is 3:8.
- discmethod
(optional) The discretization method(s). By default, all variables use "quantile". Note that "robust" uses robust_disc(), "rpart" uses rpart_disc(), and all other methods use sdsfun::discretize_vector().
- strategy
(optional) Discretization strategy. When strategy is 1L, the parameters that yield the highest SPADE model q-statistic are chosen as the optimal spatial data discretization parameters. When strategy is 2L, the optimal discretization parameters are selected with the help of a LOESS model. Default is 2L.
- increase_rate
(optional) The critical increase rate of the number of discretization. Default is 0.05 (5%).
- cores
(optional) A positive integer (default is 1). If cores > 1, a 'parallel' package cluster with that many cores is created and used. You can also supply a cluster object.
- return_disc
(optional) Whether to return the discretized result obtained with the optimal parameters. Default is TRUE.
- seed
(optional) Random seed number. Default is 123456789. Setting a random seed is useful when the sample size is greater than 3000 (the default value for largeN) and the data is discretized by sampling 10% (the default value for samp_prop in st_unidisc()).
- ...
(optional) Other arguments passed to st_unidisc(), robust_disc() or rpart_disc().
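To make the roles of discnum and discmethod more concrete, the following is a minimal base R sketch of quantile-based discretization applied over every candidate class count. It is not the package's implementation (which, per the argument description, delegates to sdsfun::discretize_vector()); the helper quantile_disc() and the simulated vector x are hypothetical and exist only for illustration.

```r
# Hypothetical helper (not part of the package): quantile discretization in base R
quantile_disc <- function(x, k) {
  # k + 1 quantile break points split x into k classes of roughly equal size
  breaks <- quantile(x, probs = seq(0, 1, length.out = k + 1), names = FALSE)
  # unique() guards against duplicated break points when x contains many ties
  as.integer(cut(x, breaks = unique(breaks), include.lowest = TRUE))
}

set.seed(42)
x <- rnorm(80)  # simulated continuous variable, for illustration only

# Discretize x once per candidate class count, mirroring discnum = 3:8
classes <- lapply(3:8, function(k) quantile_disc(x, k))
names(classes) <- paste0("k", 3:8)

# Number of distinct classes obtained for each candidate
sapply(classes, function(z) length(unique(z)))
```

Each element of classes is an integer vector of class labels, which is the kind of input a q-statistic would then be computed on for each candidate k.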
Value
A list containing the optimal parameters among the provided parameter combinations, with elements x, k, method and disv (when return_disc is TRUE).
x
discretization variable name
k
optimal number of classes for spatial data discretization
method
optimal spatial data discretization method
disv
the result of optimal spatial data discretization
Note
When discmethod is set to robust, the function runs significantly slower. Consequently, the use of robust discretization is not advised.
References
Yongze Song & Peng Wu (2021) An interactive detector for spatial associations, International Journal of Geographical Information Science, 35:8, 1676-1701, DOI:10.1080/13658816.2021.1882680
Author
Wenbo Lv lyu.geosocial@gmail.com
Examples
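# Load the example dataset sim
data('sim')
# Build an inverse-distance spatial weight matrix from the point coordinates (lo, la)
wt = sdsfun::inverse_distance_swm(sf::st_as_sf(sim, coords = c('lo','la')))
# Search for the optimal discretization of xa, xb and xc with the default settings
cpsd_disc(y ~ xa + xb + xc,
          data = sim,
          wt = wt)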
#> $x
#> [1] "xa" "xb" "xc"
#>
#> $k
#> [1] 5 5 7
#>
#> $method
#> [1] "quantile" "quantile" "quantile"
#>
#> $disv
#> # A tibble: 80 × 3
#> xa xb xc
#> <int> <int> <int>
#> 1 1 4 3
#> 2 4 4 6
#> 3 2 5 3
#> 4 1 3 2
#> 5 4 4 5
#> 6 2 4 4
#> 7 4 4 6
#> 8 1 2 2
#> 9 4 3 4
#> 10 5 1 6
#> # ℹ 70 more rows
#>
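The call above uses the default strategy = 2L, which combines the q-statistics with a LOESS model and the increase_rate threshold. The sketch below shows one plausible reading of that idea in base R; it is an assumption, not the package's actual selection rule. The q values are made-up numbers for illustration, and the rule "pick the first k whose relative gain in smoothed q falls below increase_rate" is hypothetical.

```r
# Hypothetical q-statistics for candidate class counts (made-up illustrative values)
k <- 3:12
q <- c(0.30, 0.38, 0.44, 0.48, 0.51, 0.53, 0.545, 0.555, 0.56, 0.565)

# Smooth q against k with a LOESS fit
fit <- stats::loess(q ~ k)
q_smooth <- stats::predict(fit, newdata = data.frame(k = k))

# Relative increase of the smoothed q when moving from k[i] to k[i + 1]
rate <- diff(q_smooth) / q_smooth[-length(q_smooth)]

# Assumed rule: take the first k after which the gain drops below increase_rate;
# fall back to the largest k if the gain never drops below the threshold
increase_rate <- 0.05
below <- which(rate < increase_rate)
k_opt <- if (length(below) > 0) k[below[1]] else k[length(k)]
k_opt
```

By contrast, strategy = 1L as documented would correspond to simply taking k[which.max(q)], the candidate with the highest q-statistic.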