[R-sig-Geo] question on the difference between spdep function spautolm() and glm() with autocovariate

Li, Han Han_Li at baylor.edu
Wed Feb 13 01:47:03 CET 2013


Hey Roger, thank you very much for your explanation. I found out from the R help that errorsarlm() and spautolm(..., family="SAR") do the same analysis. So I was afraid this (two functions using the same model and doing the same analysis) might happen somewhere else. I have never taken a regression analysis or spatial statistics class. In general I have been using your book Applied Spatial Data Analysis with R as my guide.

I have the following question:
You mentioned that the model for glm()+auto is  y = rho W y + Xb + e
Based on what I know, this is also the model for function lagsarlm()
So do glm()+auto and lagsarlm() do the same/similar analysis?

The second question is:
If I use several different models, for example here, I used CAR, SAR, glm()+auto, and lagsarlm(), is it wise to use AIC to decide which model fits my data better? Based on your previous explanation, I now understand that glm()+auto makes the coefficients hard to interpret. But how about CAR and SAR, or lagsarlm(), assuming it is different from glm()+auto?

Also, after trying all these models (SAR, CAR, glm()+auto, and lagsarlm()), if some variables are significant in all of them, can I conclude that these variables are more important than other variables which are only significant in one or two models?
For example, variable 1 is significant in all 4 models for predicting the presence of bats; variable 2 is only significant in SAR. Can I say that variable 1 is more important than variable 2? I would like to point out that my research is about bats in urban environments. There is very little biological information to help me interpret the results.

Thanks again.

Han



Han Li
Ph.D. Candidate
Department of Biology
Baylor University
Waco, TX  76798-7388
Phone: (254) 710-2151
Fax: (254) 710-2969
han_li at baylor.edu




________________________________________
From: Roger Bivand [Roger.Bivand at nhh.no]
Sent: Tuesday, February 12, 2013 1:05 AM
To: Li, Han
Cc: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] question on the difference between spdep function spautolm() and glm() with autocovariate

On Tue, 12 Feb 2013, Li, Han wrote:

> Dear list,
>
> I am currently working on spatial autoregression modeling for my
> dissertation research. I want to use regression models to identify
> socioeconomic/landscape variables (15 total, var1$bat_survey -
> var15$bat_survey) that can affect the presence/absence of bats
> (p/a$bat_survey). Since spatial autocorrelation exists in my P/A data, I
> tried different spatial models.
>
> My question is:
>
> If I use the same way (same neighboring criteria, same weight style) to
> define neighbors, and build model #1 spatial simultaneous autoregression
> model (SAR) by function spautolm(), and model #2 glm() with
> autocovariate generated by function autocov_dist(), should I expect the
> same result, or not?

No, because obviously they are different models:

spautolm: y = Xb + u, u = lambda W u + e, e ~ N(0, sigma2 I)

glm+auto: y = rho W y + Xb + e

Interpreting the latter is subject to great difficulty (see ?impacts)
because the DGP is:

(I - rho W) y = Xb + e,

y = (I - rho W)^{-1} (Xb + e)

so the b coefficients cannot be interpreted directly. In addition, the glm
estimate of rho is biassed because it is not constrained to its feasible
range (so that (I - rho W) can be inverted).
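
A minimal base-R sketch of the two points above, under stated assumptions: the ring of 6 sites, the weights matrix W, and the values of rho and b are all hypothetical and chosen only for illustration. It computes average direct and total effects from the reduced form, in the spirit of what ?impacts describes, and one common choice of feasible interval for rho taken from the eigenvalues of W.

```r
# Hypothetical toy setup: 6 sites on a ring, each with two neighbours.
n <- 6
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  left  <- if (i == 1) n else i - 1
  right <- if (i == n) 1 else i + 1
  W[i, c(left, right)] <- 0.5   # row-standardised: each row sums to 1
}

rho <- 0.5   # assumed spatial lag coefficient
b   <- 2     # assumed coefficient on a single covariate x

# Reduced form: y = (I - rho W)^{-1} (x b + e),
# so the matrix of partial effects dE[y]/dx' is (I - rho W)^{-1} * b.
S <- solve(diag(n) - rho * W) * b

direct   <- mean(diag(S))      # average effect of x_i on y_i
total    <- mean(rowSums(S))   # average effect on y_i of changing x everywhere
indirect <- total - direct     # part transmitted through neighbours

cat("b:", b, " direct:", direct, " indirect:", indirect, " total:", total, "\n")
# With row-standardised W, total equals b / (1 - rho), here 4, not b = 2.

# A common feasible interval for rho: 1/min(eigenvalue) < rho < 1/max(eigenvalue).
# W is symmetric in this toy case, so its eigenvalues are real.
ev <- eigen(W, only.values = TRUE)$values
rho_range <- 1 / range(ev)
cat("feasible interval for rho:", rho_range, "\n")
```

None of the summary effects equals b itself, which is why b cannot be read off as the effect of x. An unconstrained fit such as glm() has no way of keeping its estimate of rho inside the interval.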

Using geo-additive models, ML in the spautolm case, or others, is easier
to handle because the fitted coefficients don't interact. It is only in
some settings that the observations interact with each other directly;
more often the autocorrelation is in the residuals.
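
As a rough base-R illustration of the error specification (again with a hypothetical ring of 6 sites and assumed values for lambda, b and sigma2), the sketch below shows that spatial dependence enters only through the covariance of u, while the mean response to x stays b at the site itself and zero elsewhere.

```r
# Hypothetical toy setup: 6 sites on a ring, row-standardised weights.
n <- 6
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  left  <- if (i == 1) n else i - 1
  right <- if (i == n) 1 else i + 1
  W[i, c(left, right)] <- 0.5
}

lambda <- 0.5   # assumed spatial error coefficient
b      <- 2     # assumed coefficient on a single covariate x
sigma2 <- 1     # assumed innovation variance

# Error model: y = x b + u, u = lambda W u + e, so u = (I - lambda W)^{-1} e.
A_inv <- solve(diag(n) - lambda * W)

# Mean part: E[y] = x b, so the matrix of partial effects is diagonal.
dEy_dx <- b * diag(n)

# Covariance part: Var(u) = sigma2 * A^{-1} A^{-T}, which is not diagonal.
Sigma_u <- sigma2 * A_inv %*% t(A_inv)

print(dEy_dx)             # b on the diagonal, 0 elsewhere
print(round(Sigma_u, 3))  # neighbouring sites have correlated errors
```

Here b keeps its usual regression meaning, and the autocorrelation is carried entirely by Sigma_u.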

Hope this clarifies,

Roger

>
> I understood that if I use glm() with autocovariate it will include one
> more variable (the autocovariate) in the result. I also learned that
> glm() is more a predicting model and spautolm() is more an explanatory
> model. But I am not sure whether the significant variables selected by
> these two models will be same.
>
>
> ##r code example##
> model_1 <- spautolm (p/a ~ var1 + var2 + ... + var15, data = bat_survey,
> listw = neighbor_regime1, family="SAR")
> ####
> autocov_model_2 <- autocov_dist (p/a$bat_survey, xy = coords, style =
> "W", type = "one")
> model_2 <- glm (p/a ~ var1 + var2 + ... + var15 + autocov_model_2,
> family = "binomial", data = bat_survey)
>
> Thanks in advance. Your insight will be deeply appreciated.
>
> Han
>
>
> Han Li
> Department of Biology
> Baylor University
> Waco, TX 76798-7388
> Phone: (254) 710-2151
> Fax: (254) 710-2969
> han_li at baylor.edu
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

--
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


