[R-sig-Geo] Removing spatial autocorrelation - Memory limits

Roger Bivand Roger.Bivand at nhh.no
Mon Dec 12 11:50:55 CET 2011


On Mon, 12 Dec 2011, Chris Mcowen wrote:

> Dear List,
>
> I am trying to model variation in fisheries catch. I have a large data
> set of 36,574 cells, with knowledge of the tonnes of fish caught in each
> 0.5-degree global cell.
>
> At present I simply want to investigate whether certain "areas" (clusters
> of cells) have higher numbers of fish than others.
>
> Due to the grid nature of the data, I have significant spatial
> autocorrelation.
>
> I have tried:
>
> REALM_gls <- gls(tonnesperkm_log~REALM, correlation = corGaus(form =~Lat +
> Lon), data = Cells)
>
> coords<-cbind(Cells$Lat,Cells$Lon)
> coords<-as.matrix(coords)
> nb1.5<-dnearneigh(coords,0,1.5)
> nb1.5.w<-nb2listw(nb1.5, glist=NULL, style="W", zero.policy=TRUE)
> ols_lm<-lm(Cells$tonnesperkm_log~Cells$REALM)
> ols_lm_error_REALM<-errorsarlm(ols_lm, listw=nb1.5.w, na.action = na.omit,
> zero.policy = T, data = Cells)

This is not right: the first argument to errorsarlm() is a formula object.
Did you look at the help page? If you did, did you look at the method=
argument, which offers, among others, sparse matrix methods for larger N?
You may also have no-neighbour observations; again, see the help page. In
addition, you have geographical coordinates, so you should set the
arguments to dnearneigh() appropriately; here they assume a planar
surface, but may still recover an acceptable neighbour set.
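Putting those points together, a sketch of a corrected call might look
like the following (assuming Cells has the columns Lon, Lat,
tonnesperkm_log and REALM used in your code, and an illustrative 150 km
threshold; note that with longlat=TRUE the distance bounds to
dnearneigh() are in kilometres, not degrees):

```r
library(spdep)  # dnearneigh(), nb2listw(), errorsarlm()

## coordinate matrix: x = longitude, y = latitude
coords <- as.matrix(cbind(Cells$Lon, Cells$Lat))

## longlat = TRUE uses Great Circle distances, so d1 and d2 are in km;
## the 150 km upper bound here is only an example threshold
nb <- dnearneigh(coords, d1 = 0, d2 = 150, longlat = TRUE)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

## a formula as the first argument, not a fitted lm object;
## method = "Matrix" selects sparse matrix methods, which is what
## makes N = 36574 feasible in memory
fit <- errorsarlm(tonnesperkm_log ~ REALM, data = Cells, listw = lw,
                  method = "Matrix", zero.policy = TRUE,
                  na.action = na.omit)
summary(fit)
```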

> llk1 <- knn2nb(knearneigh(coords, k=1, longlat=FALSE))
> col.nb.0.all <- dnearneigh(coords, 0, llk1)

Wrong function argument again: llk1 is an nb object, not a scalar
distance. Again, please do read the help pages carefully. You would need
to find the maximum of the first-nearest-neighbour distances, but I
advise against this approach here.
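For completeness, the usual way to get that maximum is via nbdists(),
though as said I would not use it for a global half-degree grid, where it
tends to produce a very dense neighbour set:

```r
library(spdep)  # knearneigh(), knn2nb(), nbdists(), dnearneigh()

coords <- as.matrix(cbind(Cells$Lon, Cells$Lat))

## nb object of first nearest neighbours (Great Circle distances)
k1 <- knn2nb(knearneigh(coords, k = 1, longlat = TRUE))

## maximum first-nearest-neighbour distance, in kilometres
dmax <- max(unlist(nbdists(k1, coords, longlat = TRUE)))

## dmax is now a scalar distance, so it is a valid upper bound
## for dnearneigh(), guaranteeing every cell at least one neighbour
nb.all <- dnearneigh(coords, d1 = 0, d2 = dmax, longlat = TRUE)
```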

My advice: please do read the help pages carefully, and the relevant
literature, for example ASDAR at www.asdar-book.org.

Roger


> col.nb.0.all
> summary(col.nb.0.all)
> ols_error_REALM2<-errorsarlm(ols_lm, listw=col.nb.0.all, na.action =
> na.omit, zero.policy = T, data = Cells)
>
> However, the memory required is large (16 GB).
>
> This won't run on my computer, so I have two questions:
>
> First, is the process I am doing correct - or can it be done in a more
> efficient way?
>
> Second, can I take a subsample of the data, i.e. every other cell, and
> run the analysis? Or find a way of subsetting and "re-joining"?
>
> Thanks in advance,
>
> Chris
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


