[R-sig-Geo] Removing spatial autocorrelation - Memory limits

Chris Mcowen chrismcowen at gmail.com
Mon Dec 12 12:48:07 CET 2011


Thanks Roger,

I am aware it may look like I did not read the help page but I did.

I tried to include the longlat argument and got this:

coords<-cbind(Cells$Lat,Cells$Lon)
nb1.5<-dnearneigh(coords,0,1.5, longlat = TRUE)
Warning message:
In dnearneigh(coords, 0, 1.5, longlat = TRUE) :
  Coordinates are not geographical: longlat argument wrong


> head(Cells$Lon)
[1] 121.75 120.75 121.25 121.75 122.25 119.75
> head(Cells$Lat)
[1] 41.25 40.75 40.75 40.75 40.75 40.25
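
A possible reading of the warning above, offered as a sketch rather than a confirmed diagnosis: with longlat = TRUE, dnearneigh() treats the first column as longitude and the second as latitude, and the second column here holds values above 90. Distances are then measured in kilometres, so an upper bound of 1.5 would mean 1.5 km. The six-row data frame below is a stand-in built from the values printed above, and the 167 km bound (roughly 1.5 degrees of latitude) is an assumed, illustrative threshold.

```r
library(spdep)

# Stand-in for the real Cells data frame, using only the values shown by head()
Cells <- data.frame(
  Lon = c(121.75, 120.75, 121.25, 121.75, 122.25, 119.75),
  Lat = c(41.25, 40.75, 40.75, 40.75, 40.75, 40.25)
)

# Longitude in the first column, latitude in the second
coords <- cbind(Cells$Lon, Cells$Lat)

# With longlat = TRUE the distance bounds are in kilometres.
# 167 km is an assumed value, roughly 1.5 degrees of latitude.
nb <- dnearneigh(coords, 0, 167, longlat = TRUE)
summary(nb)
```

Because a degree of longitude shrinks towards the poles, a single kilometre threshold will not pick out the same number of grid cells at every latitude.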



Regarding the formula argument to errorsarlm(), I was following the method of
Kissling and Carl 2008:

#######################
#OLS model for organism 1
ols<-lm(data$organism1~data$rain+data$jungle)
summary(ols)
res.ols <- residuals(ols)

#######################

#######################
#SARerr model with neighbourhood distance 1.5 and coding style "W"

#Specify SARerr model
sem.nb1.5.w<-errorsarlm(ols, listw=nb1.5.w)
summary(sem.nb1.5.w)
res.sem.nb1.5.w <- residuals(sem.nb1.5.w)

#######################



Regarding the sparse matrix method:

To be honest, I did not know what this does. I had seen it used with
large datasets, but I am unsure what it is doing methodologically. I ran
it and it completed very quickly, and I was unsure why. I appreciate I
should read up on the methodology behind this before reporting my results.
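
As a rough illustration of why a sparse method can finish quickly where a dense one does not fit in memory, the arithmetic below compares a dense n x n weights matrix with sparse storage. It assumes 8-byte doubles, 4-byte row indices, and at most 28 neighbours per cell (the number of grid cells within 1.5 degrees on a 0.5 degree planar grid). These are back-of-envelope figures, not measurements of what errorsarlm() actually allocates.

```r
n <- 36574                      # number of cells in the dataset

# Dense storage: every pair of cells gets an 8-byte double
dense_bytes <- n^2 * 8
dense_gb <- dense_bytes / 1e9

# Sparse storage: only non-zero weights are kept.
# Assumed: at most 28 neighbours per cell, 8-byte value + 4-byte index each.
max_nb <- 28
sparse_bytes <- n * max_nb * (8 + 4)
sparse_mb <- sparse_bytes / 1e6

cat(sprintf("dense: %.1f GB, sparse: %.1f MB\n", dense_gb, sparse_mb))
```

Under these assumptions a single dense copy needs about 10.7 GB, while the sparse representation needs roughly 12 MB.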

Thanks

Chris



-----Original Message-----
From: Roger Bivand [mailto:Roger.Bivand at nhh.no] 
Sent: 12 December 2011 10:51
To: Chris Mcowen
Cc: r-sig-geo at r-project.org
Subject: Re: [R-sig-Geo] Removing spatial autocorrelation - Memory limits

On Mon, 12 Dec 2011, Chris Mcowen wrote:

> Dear List,
>
> I am trying to model variation in fisheries catch. I have a large data 
> set (36574 cells), with the tonnes of fish caught in each 0.5 degree 
> global cell.
>
> At present I simply want to investigate whether certain "areas" (clusters
> of cells) have higher numbers of fish than others.
>
> Due to the grid nature of the data set I have significant spatial
> autocorrelation in the data set.
>
> I have tried:
>
> REALM_gls <- gls(tonnesperkm_log~REALM, correlation = corGaus(form =~Lat +
> Lon), data = Cells)
>
> coords<-cbind(Cells$Lat,Cells$Lon)
> coords<-as.matrix(coords)
> nb1.5<-dnearneigh(coords,0,1.5)
> nb1.5.w<-nb2listw(nb1.5, glist=NULL, style="W", zero.policy=TRUE)
> ols_lm<-lm(Cells$tonnesperkm_log~Cells$REALM)
> ols_lm_error_REALM<-errorsarlm(ols_lm, listw=nb1.5.w, na.action = na.omit,
> zero.policy = T, data = Cells)

This is not right, the first argument to errorsarlm() is a formula object. 
Did you look at the help page? If you did, did you look at the method= 
argument, which offers among others sparse matrix methods for larger N?
You may have no-neighbour observations, just see the help page. In 
addition, you have geographical coordinates, so should set the arguments 
to dnearneigh appropriately - here they are assuming a planar surface, but 
may be OK to recover neighbours.
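
A minimal sketch of the call shape described above, run on a small synthetic grid rather than the Cells data: the first argument is a formula, data= supplies the variables, and method = "Matrix" requests one of the sparse matrix methods. The variable names (resp, z) and the 10 x 10 grid are hypothetical. In current R releases errorsarlm() is provided by the spatialreg package; at the time of this thread it was in spdep.

```r
library(spdep)
library(spatialreg)  # current home of errorsarlm()

# Hypothetical 10 x 10 planar grid with one covariate and one response
set.seed(1)
grid <- expand.grid(x = 1:10, y = 1:10)
grid$z <- rnorm(nrow(grid))
grid$resp <- 1 + 0.5 * grid$z + rnorm(nrow(grid))

# Distance-based neighbours and row-standardised weights
nb <- dnearneigh(as.matrix(grid[, c("x", "y")]), 0, 1.5)
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)

# Formula first, data supplied separately, sparse method for larger N
fit <- errorsarlm(resp ~ z, data = grid, listw = lw,
                  method = "Matrix", zero.policy = TRUE)
summary(fit)
```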

> llk1 <- knn2nb(knearneigh(coords, k=1, longlat=FALSE))
> col.nb.0.all <- dnearneigh(coords, 0, llk1)

Wrong function argument again, llk1 is an nb object, not a scalar 
distance. Again, please do read the help pages carefully. You need to find 
the maximum of the first nearest neighbour distances, but I advise against 
this here.
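
For reference, the threshold described above can be computed as sketched below, on hypothetical random planar coordinates rather than the Cells data. nbdists() returns the distance from each point to its first nearest neighbour, and the maximum of those distances is the scalar passed to dnearneigh(). As noted, this is not the recommended approach for this dataset.

```r
library(spdep)

# Hypothetical planar coordinates, for illustration only
set.seed(42)
coords <- cbind(runif(50), runif(50))

# First nearest neighbour of each point, as an nb object
k1 <- knn2nb(knearneigh(coords, k = 1))

# Distances to those nearest neighbours, flattened to a numeric vector
d1 <- unlist(nbdists(k1, coords))

# Scalar upper bound: the largest first-nearest-neighbour distance
max_d <- max(d1)

nb_all <- dnearneigh(coords, 0, max_d)
summary(nb_all)
```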

Advice: please do read the help pages carefully, and the relevant literature, 
for example ASDAR at www.asdar-book.org.

Roger


> col.nb.0.all
> summary(col.nb.0.all)
> ols_error_REALM2<-errorsarlm(ols_lm, listw=col.nb.0.all, na.action =
> na.omit, zero.policy = T, data = Cells)
>
>
>
>
>
> However the memory required is large - 16GB.
>
>
>
> This won't run on my computer,  I therefore had two questions:
>
>
>
> First, is the process I am doing correct - or can it be done in a more
> efficient way?
>
>
>
> Second, can I take a subsample of the data, i.e. every other cell, and run
> the analysis? Or find a way of subsetting and "re-joining"?
>
>
>
> Thanks in advance,
>
>
>
> Chris
>

-- 
Roger Bivand
Department of Economics, NHH Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


