Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong calculation of impacts in SLX #23

Closed
nk027 opened this issue Nov 27, 2021 · 0 comments
Closed

Wrong calculation of impacts in SLX #23

nk027 opened this issue Nov 27, 2021 · 0 comments

Comments

@nk027
Copy link
Contributor

nk027 commented Nov 27, 2021

I was reproducing some results to compare spatialreg (version 1.2-1) and bsreg and stumbled upon an issue with the calculation of impacts for the SLX term (i.e. impacting SLX, SDM, and SDEM models).
For the SLX, average indirect impacts are often assumed to be reflected by θ directly. However, this is not generally the case. Partial effects of a univariate SLX are given by

∂y/∂x = Iβ + Wθ

The average indirect effect (i.e. observation i on j != i) is given by sum(W) / N * theta. For row-stochastic W we have sum(W) == N, so the impact is given by θ, but not for other scaling types. I am currently wrapping up a working paper, where I go into some more detail, that I'll link here later.

The issue is that this is not accounted for in the impacts() methods for models with the LX term, as I demonstrate below. I think this can be fixed in a straightforward manner and I can start work on a PR, but will need some time. A code example of what I am talking about is below.

library("spatialreg")
set.seed(42)
N <- 100L

# Build connectivity ---
xy <- cbind(runif(N), runif(N)) # Simulate locations
dist <- as.matrix(dist(xy)) # Compute distance
diag(dist) <- Inf # We want zeros on the diagonal
W <- dist^-2 # Compute inverse-distance with decay parameter two
W_rs <- W / rowSums(W) # Use row-, eigenvalue-, and norm-scaling
W_ev <- W / max(eigen(W, symmetric = TRUE, only.values = TRUE)$values)
W_fn <- W / norm(W, "F")

# Build DGP ---
X <- matrix(rnorm(N), N)
beta <- 2
theta <- 1
y <- X * beta + W_ev %*% X * theta + rnorm(N, sd = 0.01) # No need for noise

# Estimate SLX (using lm and spatialreg) ---

# Use lm and compare theta
(m1 <- lm(y ~ 0 + X + W_ev %*% X)) # True model -- correct
(m2 <- lm(y ~ 0 + X + W_rs %*% X)) # Appears way off
(m3 <- lm(y ~ 0 + X + W_fn %*% X))
# Results mirror spatialreg
(s1 <- lmSLX(y ~ 0 + X, listw = spdep::mat2listw(W_ev)))
(s2 <- lmSLX(y ~ 0 + X, listw = spdep::mat2listw(W_rs)))
(s3 <- lmSLX(y ~ 0 + X, listw = spdep::mat2listw(W_fn)))
# The thetas are very different, depending on W's scale

# Compute average effects
c("Direct" = coef(m1)[[1]], "Indirect" = sum(W_ev) / N * coef(m1)[[2]])
c("Direct" = coef(m2)[[1]], "Indirect" = sum(W_rs) / N * coef(m2)[[2]])
c("Direct" = coef(m3)[[1]], "Indirect" = sum(W_fn) / N * coef(m3)[[2]])
# W_ev and W_fn are equivalent since they only differ by scaling -- W_rs is off
# since it's not symmetric, but effects are closer than theta would imply.

# Compute average effects using spatialreg
impacts(s1)
impacts(s2)
impacts(s3)
# The package just uses the coefficients and ignores the strucutre of W

# Check against true model ---
c("Direct" = beta, "Indirect" = sum(W_ev) / N * theta,
  "Direct (empirical)" = mean(((X + 1) * beta + W_ev %*% X %*% theta) - y),
  "Indirect (empirical)" = mean((X * beta + W_ev %*% (X + 1) %*% theta) - y))
# Partial effects computed manually seem valid
nk027 pushed a commit to nk027/spatialreg that referenced this issue Dec 17, 2021
As documented in issue r-spatial#23 and doi.org/10.13140/RG.2.2.11269.68328
the average indirect effects of the SLX model are not correct for
non-row-stochastic matrices. This commit fixes that.

To be maximally non-invasive to users, I apply the necessary scale
factor to the coefficients of the underlying LM model. This means
that the coefficients *can* be directly interpreted as partial
effects -- which is probably the way it's being done anyways.

This could also be addressed in a more involved manner (for users
and coders) -- we could apply the scale factor only to the indirect
impacts (returned by 'impacts()') and only adjust the model used to
compute total effects. I did not pursue this as it (1) would require
more code, (2) be more contrived, and (3) does not reflect usage
patterns that I would expect.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants