[R-sig-Geo] dnearneigh, knearneigh and SAR - not removing spatial autocorrelation

Wed Apr 27 22:39:10 CEST 2011

Hi Roger,

Thanks for this. As previously mentioned, I am relatively new to this. I started with a GLS method but then realised it may not be the most effective, so I decided to try SAR methods.

> What definition of neighbour are you using in the correlogram?

I call the SAR as follows:

sem.nb1.5.w <- errorsarlm(ols2, listw=nb1.5.w, zero.policy=FALSE, na.action=na.omit)

Then extract the residuals:

res.sem.nb1.5.w <- residuals(sem.nb1.5.w)

Then use these to construct the correlogram:

cor.sem.nb1.5.w <- correlog(data$X, data$Y, z=res.sem.nb1.5.w, na.rm=TRUE, increment=1, resamp=1)

> The correlogram obviously doesn't care about no-neighbour observations, so perhaps do the same?

How do I do this? If I set dmax too low I get "Empty neighbour sets found", so the model won't run. I feel I am setting dmax artificially, based on the fact that it works rather than on any scientific rationale.

I tried a method used in your example:

> coords <- coordinates(columbus)
> rn <- sapply(slot(columbus, "polygons"), function(x) slot(x, "ID"))
> k1 <- knn2nb(knearneigh(coords))
> all.linked <- max(unlist(nbdists(k1, coords)))
> col.nb.0.all <- dnearneigh(coords, 0, all.linked, row.names=rn)

To join everything on a nearest-neighbour basis. But that resulted in little difference in the correlogram and AIC.

So my two questions are:

Do I have to link every neighbour? If not, how do I run the model without getting the error, and how do I decide which neighbours not to link?

Will it make a large difference if the correlogram doesn't change much? Does this mean the model is not "dealing" with the correlation effectively, and how do I improve this?

Thanks again and sorry for my relatively naive questions!

Chris

On 27 Apr 2011, at 21:16, Roger Bivand wrote:

On Wed, 27 Apr 2011, Chris Mcowen wrote:

> Dear list,
> 
> I was wondering if somebody could possibly offer me some advice?
> 
> I have a dataset where there is spatial autocorrelation present, visible from the correlogram. So I tried to remove it using a SAR. My data is for 458 regions of differing size, for which I have long-lat coordinates.
> 
> 
> coords<-cbind(data$X,data$Y)
> coords<-as.matrix(coords)
> 
> The first approach was to use dnearneigh to set up the neighbourhood. I am very new to this and was having problems, as regions would often appear with no links (see below), so I increased dmax until this no longer occurred. This may be an incorrect method?
> 
>> dnearneigh(coords, 0, 1500, row.names = NULL, longlat = TRUE)
> Neighbour list object:
> Number of regions: 458
> Number of nonzero links: 8990
> Percentage nonzero weights: 4.285769
> Average number of links: 19.62882
> 2 regions with no links:
> 235 236
> 
> Results in - Empty neighbour sets found
> 
>> dnearneigh(coords, 0, 2000, row.names = NULL, longlat = TRUE)
> Neighbour list object:
> Number of regions: 458
> Number of nonzero links: 14200
> Percentage nonzero weights: 6.769512
> Average number of links: 31.00437
> 
> I then converted this to a weights list and used it in my SAR.
> 
> nb1.5 <- dnearneigh(coords, 0, 2000, row.names = NULL, longlat = TRUE)
> nb1.5.w<-nb2listw(nb1.5, glist=NULL, style="W", zero.policy=FALSE)
> 
> However, looking at the correlogram and the AIC (below), it seems not to have made a huge difference:
> 
> AIC: -2581.8, (AIC for lm: -2574)

With relatively large numbers of neighbours, you smooth more. What definition of neighbour are you using in the correlogram? Why not just use the same? The correlogram obviously doesn't care about no-neighbour observations, so perhaps do the same? If you want to link every observation in, why not then down-weight distant neighbours using inverse distance weights - see ?nbdists.
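
A minimal sketch of the inverse-distance suggestion above, not taken from the original exchange. It assumes the spdep package and uses made-up planar coordinates in place of the 458 long-lat centroids (for long-lat data, longlat=TRUE would be passed to dnearneigh and nbdists). The object names are illustrative only.

```r
library(spdep)

# Hypothetical stand-in for the region centroids (planar, not long-lat).
set.seed(1)
coords <- cbind(runif(100), runif(100))

# Smallest distance threshold at which every point has at least one
# neighbour: the largest first-nearest-neighbour distance.
k1 <- knn2nb(knearneigh(coords, k = 1))
all.linked <- max(unlist(nbdists(k1, coords)))
nb <- dnearneigh(coords, 0, all.linked)

# Distances along each link, turned into inverse-distance general weights
# so that distant neighbours count for less than near ones.
dists <- nbdists(nb, coords)
idw <- lapply(dists, function(d) 1 / d)

# Row-standardised weights built from the inverse distances.
lw <- nb2listw(nb, glist = idw, style = "W")
summary(card(nb))
```

The resulting lw could then be passed as the listw argument when fitting the model. If a smaller threshold is preferred and some observations are left without neighbours, zero.policy=TRUE in nb2listw and in the model-fitting call is the usual way to let them through.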

The test neighbour definition and the weights used for model fitting do not match. Further, you don't say whether your correlogram is for the OLS model residuals or just the response variable. If the latter, the explanatory variables may co-vary in space with the response, so the residuals are in fact not spatially autocorrelated.
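
As a rough illustration of the point above, the sketch below tests the response and the OLS residuals with one and the same weights object, so that the test and any subsequent model share a neighbour definition. It assumes the spdep package; the data, the covariate x and the response y are simulated and purely hypothetical.

```r
library(spdep)

# Simulated data: the covariate carries a spatial trend, the noise does not.
set.seed(42)
n <- 100
coords <- cbind(runif(n), runif(n))
x <- coords[, 1] + rnorm(n, sd = 0.1)
y <- 2 * x + rnorm(n, sd = 0.5)

# One neighbour definition, reused for every test below.
nb <- knn2nb(knearneigh(coords, k = 4))
lw <- nb2listw(nb, style = "W")

ols <- lm(y ~ x)

# Moran's I for the raw response.
mi_y <- moran.test(y, lw)

# Moran's I for the OLS residuals, with the same weights.
mi_res <- lm.morantest(ols, lw)

print(mi_y)
print(mi_res)
```

In a set-up like this the response may look autocorrelated while the residuals do not, because the covariate accounts for the spatial pattern. With real data the two tests have to be compared case by case.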

Hope this helps,

Roger

> 
> So I tried defining my neighbourhood using the knearneigh function, but that made very little difference:
> 
> test <- knearneigh(coords, k=1, longlat = NULL, RANN=TRUE)
> knn2nb(test, row.names = NULL, sym = FALSE)
> k1 <- knn2nb(knearneigh(coords))
> all.linked <- max(unlist(nbdists(k1, coords)))
> col.nb.0.all <- dnearneigh(coords, 0, all.linked, row.names=rn)
> 
> AIC: -2581.8, (AIC for lm: -2574)
> 
> Is there something I am doing wrong, or is there a step I am missing?
> 
> Any help would be gratefully received.
> 
> Chris
> 
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
