<- c("sf", "mapview", "spdep", "spatialreg", "tmap", "viridisLite") # note: load spdep first, then spatialreg
pkgs lapply(pkgs, require, character.only = TRUE)
8 Exercise I
Required packages
Session info
sessionInfo()
R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8
[2] LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
time zone: Europe/London
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] viridisLite_0.4.2 tmap_3.3-3 spatialreg_1.2-9
[4] Matrix_1.5-4.1 spdep_1.2-8 spData_2.2.2
[7] mapview_2.11.0 sf_1.0-13
loaded via a namespace (and not attached):
[1] xfun_0.39 raster_3.6-20 htmlwidgets_1.6.2
[4] lattice_0.21-8 vctrs_0.6.3 tools_4.3.1
[7] crosstalk_1.2.0 LearnBayes_2.15.1 generics_0.1.3
[10] parallel_4.3.1 sandwich_3.0-2 stats4_4.3.1
[13] tibble_3.2.1 proxy_0.4-27 fansi_1.0.4
[16] pkgconfig_2.0.3 KernSmooth_2.23-21 satellite_1.0.4
[19] RColorBrewer_1.1-3 leaflet_2.1.2 webshot_0.5.4
[22] lifecycle_1.0.3 compiler_4.3.1 deldir_1.0-9
[25] munsell_0.5.0 terra_1.7-29 leafsync_0.1.0
[28] codetools_0.2-19 stars_0.6-1 htmltools_0.5.5
[31] class_7.3-22 pillar_1.9.0 MASS_7.3-60
[34] classInt_0.4-9 lwgeom_0.2-13 wk_0.7.3
[37] abind_1.4-5 boot_1.3-28.1 multcomp_1.4-24
[40] nlme_3.1-162 tidyselect_1.2.0 digest_0.6.31
[43] mvtnorm_1.2-2 dplyr_1.1.2 splines_4.3.1
[46] fastmap_1.1.1 grid_4.3.1 colorspace_2.1-0
[49] expm_0.999-7 cli_3.6.1 magrittr_2.0.3
[52] base64enc_0.1-3 dichromat_2.0-0.1 XML_3.99-0.14
[55] survival_3.5-5 utf8_1.2.3 TH.data_1.1-2
[58] leafem_0.2.0 e1071_1.7-13 scales_1.2.1
[61] sp_1.6-1 rmarkdown_2.22 zoo_1.8-12
[64] png_0.1-8 coda_0.19-4 evaluate_0.21
[67] knitr_1.43 tmaptools_3.1-1 s2_1.1.4
[70] rlang_1.1.1 Rcpp_1.0.10 glue_1.6.2
[73] DBI_1.1.3 rstudioapi_0.14 jsonlite_1.8.5
[76] R6_2.5.1 units_0.8-2
Reload data from pervious session
load("_data/msoa2_spatial.RData")
8.1 Environmental inequality
How would you investigate the following descriptive research question: Are ethnic (and immigrant) minorities in London exposed to higher levels of pollution? Also consider the spatial structure. What’s your dependent and whats your independent variable?
1) Define a neigbours weights object of your choice
Assume a typical neighbourhood would be 1.5km in diameter
<- st_centroid(msoa.spdf) coords
Warning: st_centroid assumes attributes are constant over
geometries
# Neighbours within 3km distance
<- dnearneigh(coords, d1 = 0, d2 = 2500)
dist_15.nb
summary(dist_15.nb)
Neighbour list object:
Number of regions: 983
Number of nonzero links: 15266
Percentage nonzero weights: 1.579859
Average number of links: 15.53001
4 regions with no links:
158 463 478 505
Link number distribution:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
4 5 9 23 19 26 36 31 53 39 61 63 59 48 42 35 24 31 28 30 27 26
22 23 24 25 26 27 28 29 30 31 32 33 34
25 19 38 29 32 38 26 16 20 10 8 1 2
5 least connected regions:
160 469 474 597 959 with 1 link
2 most connected regions:
565 567 with 34 links
# There are some mpty one. Lets impute with the nearest neighbour
<- knearneigh(coords, k = 1)
k2.nb
# Replace zero
<- which(card(dist_15.nb) == 0)
nolink_ids card(dist_15.nb) == 0] <- k2.nb$nn[nolink_ids, ]
dist_15.nb[
summary(dist_15.nb)
Neighbour list object:
Number of regions: 983
Number of nonzero links: 15270
Percentage nonzero weights: 1.580273
Average number of links: 15.53408
Link number distribution:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
9 9 23 19 26 36 31 53 39 61 63 59 48 42 35 24 31 28 30 27 26 25
23 24 25 26 27 28 29 30 31 32 33 34
19 38 29 32 38 26 16 20 10 8 1 2
9 least connected regions:
158 160 463 469 474 478 505 597 959 with 1 link
2 most connected regions:
565 567 with 34 links
# listw object with row-normalization
<- nb2listw(dist_15.nb, style = "W") dist_15.lw
2) Estimate the extent of spatial auto-correlation
moran.test(msoa.spdf$no2, listw = dist_15.lw)
Moran I test under randomisation
data: msoa.spdf$no2
weights: dist_15.lw
Moran I statistic standard deviate = 65.197, p-value <
2.2e-16
alternative hypothesis: greater
sample estimates:
Moran I statistic Expectation Variance
0.891520698 -0.001018330 0.000187411
3) Estimate a spatial SAR regression model
- Estimate a spatial autoregressive SAR model
<- lagsarlm(log(no2) ~ per_mixed + per_asian + per_black + per_other
mod_1.sar + per_nonUK_EU + per_nonEU + log(POPDEN),
data = msoa.spdf,
listw = dist_15.lw,
Durbin = FALSE) # we could here extend to SDM
summary(mod_1.sar)
Call:
lagsarlm(formula = log(no2) ~ per_mixed + per_asian + per_black +
per_other + per_nonUK_EU + per_nonEU + log(POPDEN), data = msoa.spdf,
listw = dist_15.lw, Durbin = FALSE)
Residuals:
Min 1Q Median 3Q Max
-0.2140485 -0.0267085 -0.0021421 0.0238337 0.3505513
Type: lag
Coefficients: (asymptotic standard errors)
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.7004e-02 1.8122e-02 -0.9383 0.348110
per_mixed 3.4376e-04 1.4758e-03 0.2329 0.815810
per_asian -8.5205e-05 1.1494e-04 -0.7413 0.458507
per_black -4.2754e-04 2.3468e-04 -1.8218 0.068484
per_other 1.9693e-03 7.4939e-04 2.6279 0.008591
per_nonUK_EU 8.9027e-04 3.9638e-04 2.2460 0.024703
per_nonEU 1.8460e-03 3.5159e-04 5.2506 1.516e-07
log(POPDEN) 1.8650e-02 2.7852e-03 6.6963 2.138e-11
Rho: 0.9684, LR test value: 2002.5, p-value: < 2.22e-16
Asymptotic standard error: 0.0063124
z-value: 153.41, p-value: < 2.22e-16
Wald statistic: 23535, p-value: < 2.22e-16
Log likelihood: 1562.401 for lag model
ML residual variance (sigma squared): 0.0020568, (sigma: 0.045352)
Number of observations: 983
Number of parameters estimated: 10
AIC: -3104.8, (AIC for lm: -1104.3)
LM test for residual autocorrelation
test value: 108.97, p-value: < 2.22e-16
- Have a look into the true multiplier matrix \(({\boldsymbol{\mathbf{I}}_N}-\rho {\boldsymbol{\mathbf{W}}})^{-1}\beta_k\)
<- listw2mat(dist_15.lw)
W <- diag(dim(W)[1])
I
<- unname(mod_1.sar$rho)
rho
<- solve(I - rho*W)
M
1:10, 1:10] M[
1 2 3 4 5
[1,] 1.164650997 0.002433319 0.004089559 0.004034508 0.006545994
[2,] 0.010706605 1.407336301 0.643881932 0.370049927 0.464794934
[3,] 0.011246286 0.402426207 1.474021599 0.429011868 0.641526285
[4,] 0.008875918 0.185024963 0.343209495 1.684533322 0.614086824
[5,] 0.012000989 0.193664556 0.427684190 0.511739020 1.560840834
[6,] 0.010741524 0.192552594 0.452940016 0.631452476 0.672787841
[7,] 0.012779708 0.141953871 0.299247377 0.418234186 0.616895800
[8,] 0.014769006 0.125781189 0.253122442 0.295553039 0.500919513
[9,] 0.011708131 0.147549264 0.309080773 0.568442619 0.629156269
[10,] 0.009937859 0.152900148 0.306652041 0.727001926 0.553973310
6 7 8 9 10
[1,] 0.004882511 0.005808958 0.00872714 0.005854065 0.003613767
[2,] 0.385105188 0.283907742 0.32703109 0.324608380 0.244640236
[3,] 0.566175019 0.374059222 0.41132397 0.424986063 0.306652041
[4,] 0.631452476 0.418234186 0.38421895 0.625286881 0.581601541
[5,] 0.560656534 0.514079833 0.54266281 0.576726579 0.369315540
[6,] 1.571175245 0.558170218 0.46513922 0.661184961 0.543820047
[7,] 0.558170218 1.475511568 0.58520461 0.614170880 0.463886540
[8,] 0.357799398 0.450157392 1.46638195 0.474994894 0.272339890
[9,] 0.601077237 0.558337164 0.56135760 1.581077095 0.517983092
[10,] 0.679775059 0.579858174 0.44255232 0.712226751 1.560083138
- Create an \(N \times N\) effects matrix. What is the effect of unit 6 on unit 10?
# For beta 1
<- mod_1.sar$coefficients
beta
<- beta[2] * M
effM
1:10, 1:10] effM[
1 2 3 4
[1,] 4.003610e-04 8.364789e-07 1.405829e-06 1.386904e-06
[2,] 3.680507e-06 4.837866e-04 2.213411e-04 1.272085e-04
[3,] 3.866028e-06 1.383382e-04 5.067103e-04 1.474773e-04
[4,] 3.051190e-06 6.360427e-05 1.179819e-04 5.790759e-04
[5,] 4.125465e-06 6.657422e-05 1.470209e-04 1.759156e-04
[6,] 3.692511e-06 6.619197e-05 1.557029e-04 2.170684e-04
[7,] 4.393158e-06 4.879813e-05 1.028694e-04 1.437724e-04
[8,] 5.077000e-06 4.323860e-05 8.701349e-05 1.015994e-04
[9,] 4.024792e-06 5.072160e-05 1.062497e-04 1.954081e-04
[10,] 3.416243e-06 5.256102e-05 1.054148e-04 2.499145e-04
5 6 7 8
[1,] 2.250254e-06 1.678414e-06 1.996890e-06 3.000045e-06
[2,] 1.597781e-04 1.323839e-04 9.759625e-05 1.124204e-04
[3,] 2.205314e-04 1.946286e-04 1.285868e-04 1.413969e-04
[4,] 2.110988e-04 2.170684e-04 1.437724e-04 1.320793e-04
[5,] 5.365554e-04 1.927315e-04 1.767203e-04 1.865460e-04
[6,] 2.312779e-04 5.401079e-04 1.918768e-04 1.598965e-04
[7,] 2.120644e-04 1.918768e-04 5.072225e-04 2.011702e-04
[8,] 1.721963e-04 1.229973e-04 1.547463e-04 5.040841e-04
[9,] 2.162790e-04 2.066266e-04 1.919342e-04 1.929725e-04
[10,] 1.904341e-04 2.336798e-04 1.993323e-04 1.521320e-04
9 10
[1,] 2.012396e-06 1.242270e-06
[2,] 1.115875e-04 8.409764e-05
[3,] 1.460934e-04 1.054148e-04
[4,] 2.149489e-04 1.999316e-04
[5,] 1.982558e-04 1.269561e-04
[6,] 2.272892e-04 1.869438e-04
[7,] 2.111277e-04 1.594658e-04
[8,] 1.632845e-04 9.361968e-05
[9,] 5.435118e-04 1.780621e-04
[10,] 2.448354e-04 5.362949e-04
# "Effect" of unit 6 on unit 10
10, 6] effM[
6
0.0002336798
- Calculate and interpret the summary impact measures.
<- impacts(mod_1.sar, listw = dist_15.lw, R = 300)
mod_1.sar.imp summary(mod_1.sar.imp)
Impact measures (lag, exact):
Direct Indirect Total
per_mixed 0.0004939013 0.010385844 0.010879745
per_asian -0.0001224192 -0.002574253 -0.002696672
per_black -0.0006142789 -0.012917166 -0.013531445
per_other 0.0028294759 0.059498722 0.062328198
per_nonUK_EU 0.0012791011 0.026897166 0.028176267
per_nonEU 0.0026523198 0.055773451 0.058425770
log(POPDEN) 0.0267960076 0.563471199 0.590267206
========================================================
Simulation results ( variance matrix):
Direct:
Iterations = 1:300
Thinning interval = 1
Number of chains = 1
Sample size per chain = 300
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
per_mixed 0.0006040 0.0020920 1.208e-04 1.208e-04
per_asian -0.0001103 0.0001560 9.007e-06 1.035e-05
per_black -0.0006271 0.0003298 1.904e-05 1.904e-05
per_other 0.0027658 0.0010967 6.332e-05 6.332e-05
per_nonUK_EU 0.0012667 0.0005646 3.260e-05 3.260e-05
per_nonEU 0.0026330 0.0004942 2.853e-05 3.103e-05
log(POPDEN) 0.0268165 0.0037907 2.189e-04 2.653e-04
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
per_mixed -3.555e-03 -0.0006939 0.0006818 2.010e-03 4.614e-03
per_asian -4.052e-04 -0.0002075 -0.0001140 -1.541e-05 1.825e-04
per_black -1.223e-03 -0.0008542 -0.0006202 -4.248e-04 4.855e-07
per_other 5.201e-04 0.0020690 0.0027611 3.533e-03 4.875e-03
per_nonUK_EU 6.608e-05 0.0008741 0.0012604 1.684e-03 2.267e-03
per_nonEU 1.644e-03 0.0023112 0.0026410 2.932e-03 3.589e-03
log(POPDEN) 2.006e-02 0.0242190 0.0268230 2.966e-02 3.408e-02
========================================================
Indirect:
Iterations = 1:300
Thinning interval = 1
Number of chains = 1
Sample size per chain = 300
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
per_mixed 0.012078 0.045516 0.0026279 0.0026279
per_asian -0.002418 0.003570 0.0002061 0.0002369
per_black -0.013473 0.007693 0.0004441 0.0004602
per_other 0.059114 0.025766 0.0014876 0.0014876
per_nonUK_EU 0.027127 0.013340 0.0007702 0.0007702
per_nonEU 0.056823 0.016381 0.0009458 0.0010557
log(POPDEN) 0.573386 0.120109 0.0069345 0.0080970
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
per_mixed -0.075534 -0.014592 0.013192 0.0405609 1.008e-01
per_asian -0.010291 -0.004422 -0.002205 -0.0003546 3.968e-03
per_black -0.030384 -0.018381 -0.012752 -0.0086602 1.556e-05
per_other 0.012396 0.043711 0.057869 0.0716045 1.180e-01
per_nonUK_EU 0.001391 0.017889 0.027007 0.0347052 5.276e-02
per_nonEU 0.033511 0.046229 0.053537 0.0655623 9.040e-02
log(POPDEN) 0.394650 0.485552 0.557876 0.6467093 8.672e-01
========================================================
Total:
Iterations = 1:300
Thinning interval = 1
Number of chains = 1
Sample size per chain = 300
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
per_mixed 0.012682 0.047576 0.0027468 0.0027468
per_asian -0.002529 0.003722 0.0002149 0.0002471
per_black -0.014100 0.007999 0.0004618 0.0004778
per_other 0.061880 0.026737 0.0015437 0.0015437
per_nonUK_EU 0.028394 0.013849 0.0007996 0.0007996
per_nonEU 0.059456 0.016742 0.0009666 0.0010819
log(POPDEN) 0.600202 0.121985 0.0070428 0.0082708
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
per_mixed -0.079021 -0.01537 0.013886 0.042492 1.060e-01
per_asian -0.010702 -0.00465 -0.002321 -0.000369 4.168e-03
per_black -0.031483 -0.01929 -0.013426 -0.009016 1.604e-05
per_other 0.012907 0.04572 0.060653 0.074784 1.228e-01
per_nonUK_EU 0.001457 0.01881 0.028109 0.036437 5.446e-02
per_nonEU 0.035310 0.04858 0.056231 0.068118 9.358e-02
log(POPDEN) 0.416004 0.51273 0.584256 0.672491 9.010e-01
4) Is SAR the right model choice or would you rather estimate a different model?
- How do results change once you specify a spatial Durbin model?
I am using a spatial Durbin error model here.
<- errorsarlm(log(no2) ~ per_mixed + per_asian + per_black + per_other
mod_1.durb + per_nonUK_EU + per_nonEU + log(POPDEN),
data = msoa.spdf,
listw = dist_15.lw,
Durbin = TRUE)
summary(mod_1.durb)
Call:
errorsarlm(formula = log(no2) ~ per_mixed + per_asian + per_black +
per_other + per_nonUK_EU + per_nonEU + log(POPDEN), data = msoa.spdf,
listw = dist_15.lw, Durbin = TRUE)
Residuals:
Min 1Q Median 3Q Max
-0.1839285 -0.0254426 -0.0027042 0.0216084 0.2944840
Type: error
Coefficients: (asymptotic standard errors)
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.64939215 0.24748370 10.7053 < 2.2e-16
per_mixed 0.00553333 0.00223688 2.4737 0.013373
per_asian -0.00017156 0.00024183 -0.7094 0.478062
per_black -0.00057947 0.00034426 -1.6832 0.092334
per_other 0.00203392 0.00112534 1.8074 0.070701
per_nonUK_EU 0.00086254 0.00058902 1.4644 0.143091
per_nonEU 0.00135822 0.00053480 2.5397 0.011096
log(POPDEN) 0.02716824 0.00354239 7.6695 1.732e-14
lag.per_mixed 0.00107140 0.00819322 0.1308 0.895960
lag.per_asian -0.00060616 0.00070080 -0.8649 0.387069
lag.per_black -0.00191733 0.00130997 -1.4636 0.143291
lag.per_other 0.01014125 0.00496979 2.0406 0.041293
lag.per_nonUK_EU 0.00925620 0.00217624 4.2533 2.106e-05
lag.per_nonEU 0.00563564 0.00185541 3.0374 0.002386
lag.log(POPDEN) -0.01370891 0.01128957 -1.2143 0.224634
Lambda: 0.99424, LR test value: 1527.8, p-value: < 2.22e-16
Asymptotic standard error: 0.0024921
z-value: 398.96, p-value: < 2.22e-16
Wald statistic: 159170, p-value: < 2.22e-16
Log likelihood: 1572.051 for error model
ML residual variance (sigma squared): 0.0019546, (sigma: 0.044211)
Number of observations: 983
Number of parameters estimated: 17
AIC: NA (not available for weighted model), (AIC for lm: -1584.3)
<- impacts(mod_1.durb, listw = dist_15.lw, R = 300)
mod_1.durb.imp summary(mod_1.durb.imp, zstats = TRUE, short = TRUE)
Impact measures (SDEM, glht, n):
Direct Indirect Total
per_mixed 0.0055333298 0.0010713981 0.0066047279
per_asian -0.0001715584 -0.0006061567 -0.0007777151
per_black -0.0005794704 -0.0019173261 -0.0024967965
per_other 0.0020339226 0.0101412504 0.0121751729
per_nonUK_EU 0.0008625423 0.0092561971 0.0101187394
per_nonEU 0.0013582217 0.0056356426 0.0069938643
log(POPDEN) 0.0271682392 -0.0137089088 0.0134593305
========================================================
Standard errors:
Direct Indirect Total
per_mixed 0.0022368774 0.0081932151 0.0089412446
per_asian 0.0002418282 0.0007008041 0.0007540247
per_black 0.0003442649 0.0013099673 0.0013639632
per_other 0.0011253352 0.0049697880 0.0051074709
per_nonUK_EU 0.0005890159 0.0021762395 0.0022231333
per_nonEU 0.0005348024 0.0018554128 0.0020196122
log(POPDEN) 0.0035423870 0.0112895670 0.0132006518
========================================================
Z-values:
Direct Indirect Total
per_mixed 2.4736849 0.1307665 0.738681
per_asian -0.7094226 -0.8649446 -1.031419
per_black -1.6832108 -1.4636442 -1.830545
per_other 1.8073926 2.0405801 2.383797
per_nonUK_EU 1.4643785 4.2532989 4.551567
per_nonEU 2.5396703 3.0374063 3.462974
log(POPDEN) 7.6694724 -1.2142989 1.019596
p-values:
Direct Indirect Total
per_mixed 0.013373 0.8959600 0.46010070
per_asian 0.478062 0.3870692 0.30234457
per_black 0.092334 0.1432912 0.06716843
per_other 0.070701 0.0412926 0.01713506
per_nonUK_EU 0.143091 2.1064e-05 5.3248e-06
per_nonEU 0.011096 0.0023862 0.00053424
log(POPDEN) 1.7319e-14 0.2246336 0.30792015