1. The principle of the information consistency-based measures

\[ I_{N}\left(d,s\right) = \frac{I \left(d,s\right)}{I \left(d\right)} = \frac{I \left(d\right) - I \left(d \mid s\right)}{I \left(d\right)} = 1 - \frac{\sum_{s_i \in S}\sum_{x \in V_d} p\left(s_i,x\right) \log p\left(x \mid s_i\right)}{\sum_{x \in V_d} p\left(x\right) \log p\left(x\right)} \]

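where \(I\left(d\right) = -\sum_{x \in V_d} p\left(x\right) \log p\left(x\right)\) is the Shannon entropy of \(d\), \(I\left(d \mid s\right)\) is the conditional entropy of \(d\) given the stratification \(s\), \(V_d\) is the set of values taken by \(d\), and \(S\) is the set of strata. Here \(p\left(x\right)\) is the probability of observing \(x\) in \(U\), \(p\left(s_i,x\right)\) is the joint probability of observing \(s_i\) and \(x\) in \(U\), and \(p\left(x \mid s_i\right)\) is the probability of observing \(x\) given that the stratum is \(s_i\). The two minus signs of the entropies cancel in the ratio, which is why none appears in the last expression.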
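
The formula can be checked by hand on small vectors. The following is a minimal base-R sketch, not the package's implementation: the function name `info_consistency` and its arguments are hypothetical, and the probabilities are estimated as plain relative frequencies, which is an assumption.

```r
# Hypothetical helper: I_N(d, s) = 1 - I(d | s) / I(d).
# `d` and `s` are equal-length vectors of discrete labels.
info_consistency <- function(d, s) {
  joint <- table(s, d) / length(d)  # p(s_i, x), estimated as relative frequencies
  p_s <- rowSums(joint)             # p(s_i)
  p_d <- colSums(joint)             # p(x)
  cond <- joint / p_s               # p(x | s_i): each row divided by its p(s_i)
  nz <- joint > 0                   # empty cells contribute 0 (0 * log 0 := 0)
  h_d_given_s <- -sum(joint[nz] * log(cond[nz]))       # I(d | s)
  h_d <- -sum(p_d[p_d > 0] * log(p_d[p_d > 0]))        # I(d)
  1 - h_d_given_s / h_d
}

# Strata that fully determine d give 1; strata unrelated to d give 0.
info_consistency(d = c("a", "a", "b", "b"), s = c(1, 1, 2, 2))
info_consistency(d = c("a", "b", "a", "b"), s = c(1, 1, 2, 2))
```

The base of the logarithm does not matter because it cancels in the ratio. The sketch returns `NaN` when \(d\) takes a single value, since \(I\left(d\right) = 0\) in that case.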

2. Example

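install.packages("itmsa", dependencies = TRUE)
install.packages("gdverse", dependencies = TRUE)
library(itmsa)

# Load the NTDs example data from gdverse
ntds = gdverse::NTDs
# Discretize incidence into 5 classes (the measure is defined for a discrete d)
ntds$incidence = sdsfun::discretize_vector(ntds$incidence, 5)
# Information consistency-based measure for each explanatory variable
itm(incidence ~ watershed + elevation + soiltype,
    data = ntds, method = "icm")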
## # A tibble: 3 × 3
##   Variable     Iv    Pv
##   <chr>     <dbl> <dbl>
## 1 watershed 0.445     0
## 2 elevation 0.390     0
## 3 soiltype  0.210     0