[R-sig-Geo] spatial autocorrelation in GAM residuals for large data set

Elizabeth Webb webbe @end|ng |rom u||@edu
Tue Aug 20 15:46:01 CEST 2019


Hello,

I have a large data set (~100k rows) containing observations at points (MODIS pixels) across the northern hemisphere.  I have fitted a GAM using the bam function in mgcv and I would like to check the model residuals for spatial autocorrelation.

One idea is to use the DHARMa package (https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html#spatial-autocorrelation).  The code looks something like this:

    library(DHARMa)
    simulationOutput <- simulateResiduals(fittedModel = mymodel)  # point at which R runs into memory problems
    testSpatialAutocorrelation(simulationOutput = simulationOutput, x = data$longitude, y = data$latitude)

However, this runs into memory problems.  

Another idea is to use the following code, after this tutorial (http://www.flutterbys.com.au/stats/tut/tut8.4a.html):

    library(ape)
    library(fields)
    coords = cbind(data$longitude, data$latitude)
    d = rdist(coords)  # point at which R runs into memory problems
    w = 1 / d          # Moran.I expects weights, so invert the distances
    diag(w) = 0        # a point is not its own neighbour
    Moran.I(x = residuals(mymodel), weight = w)

But this also runs into memory problems.  I have tried increasing the amount of memory allotted to R, but that just means R works for longer before timing out.  
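
For scale, here is a rough back-of-envelope estimate of why the full distance matrix is out of reach. It assumes the matrix is stored densely as 8-byte doubles (which is what rdist returns) and takes n as exactly 100,000, which is only an approximation of the row count above.

```r
# Approximate memory needed for a dense n x n matrix of doubles.
n <- 1e5                    # assumed row count (~100k)
bytes <- n^2 * 8            # one 8-byte double per pair of points
gib <- bytes / 1024^3       # convert bytes to GiB
cat(sprintf("dense %.0f x %.0f matrix: about %.1f GiB\n", n, n, gib))
```

Any intermediate copy (for example, inverting the distances to get weights) needs the same amount again, so adding memory mostly postpones the failure.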

So, two questions: (1) Is there a memory-efficient way to check for spatial autocorrelation using Moran's I in large data sets? (2) Failing that, is there another way to check for spatial autocorrelation (besides Moran's I) that won't have such memory problems?
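
A minimal sketch of one possible approach to (1), assuming that Moran's I on random subsamples is an acceptable stand-in for the full data set. It uses base R only. The helper names moran_i and moran_subsample, the subsample size m, and the distance cutoff dmax are all hypothetical choices for illustration, and the simulated coords and residual vectors stand in for the real coordinates and residuals(mymodel). Distances are plain Euclidean distances in degrees, and the weights are not row-standardised, so the values will not match ape's Moran.I exactly.

```r
# Moran's I for values x and a weight matrix w with a zero diagonal
# (standard formula, weights not row-standardised).
moran_i <- function(x, w) {
  z <- x - mean(x)
  (length(x) / sum(w)) * sum(z * (w %*% z)) / sum(z^2)
}

# Hypothetical helper: Moran's I on a random subsample of m points, with
# inverse-distance weights truncated at dmax (in coordinate units).
moran_subsample <- function(x, coords, m = 1000, dmax = 10) {
  idx <- sample(length(x), min(m, length(x)))
  d <- as.matrix(dist(coords[idx, , drop = FALSE]))  # only m x m, not n x n
  w <- ifelse(d > 0 & d <= dmax, 1 / d, 0)           # zero diagonal and far pairs
  moran_i(x[idx], w)
}

# Simulated stand-ins for the real coordinates and residuals (assumptions).
set.seed(1)
n <- 20000
coords <- cbind(runif(n, -180, 180), runif(n, 30, 80))   # longitude, latitude
res_smooth <- sin(coords[, 1] / 20) + rnorm(n, sd = 0.5) # spatially structured
res_noise <- rnorm(n)                                    # no spatial structure

# Repeat over several subsamples to see how stable the estimate is.
i_smooth <- replicate(5, moran_subsample(res_smooth, coords))
i_noise <- replicate(5, moran_subsample(res_noise, coords))
print(round(i_smooth, 3))
print(round(i_noise, 3))
```

With the real model, the call would look like moran_subsample(residuals(mymodel), cbind(data$longitude, data$latitude)). The spread across repeated subsamples gives a rough sense of how much the estimate depends on which points were drawn. A subsample thins the points, so autocorrelation at ranges shorter than the typical spacing between subsampled points may go undetected.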

Thanks in advance,

Elizabeth







   

