[R-sig-Geo] Spatial filtering with glm with grid sampling

Roger Bivand Roger.Bivand at nhh.no
Fri Jan 12 13:45:53 CET 2018


On Thu, 11 Jan 2018, Sima Usvyatsov wrote:

> I had no idea this was coming up - can't believe it didn't come up in any
> of my searches. Looks super useful, and I'm sure I'll be using it sooner
> than later, whether on this analysis or not. The worked example is
> ridiculously useful - so glad to have it.
>
> If I'm reading the help page correctly (the autocorrelation bits,
> struc(terms|group)) - to replicate the spatial autocorrelation within
> Round, would I add this?

Not sure - please report back if you find an equivalent to the glmmPQL 
approach.

Roger

>
> exp(site.Easting + site.Northing | RoundStart)
>
> So something like this:
>
> mod1 <- glmmTMB(Count ~ Stratum + SiteInStratum + ...other predictors +
> # random variable
> (1 | RoundStart) +
> # autocorrelation
> exp(site.Easting + site.Northing | RoundStart),
>
> family = nbinom2, # or nbinom1 - I guess decide based on residuals?
> correlation = corExp(form=~site.Easting + site.Northing + RoundStart)
>
> Does that look like the twin of the glmmPQL specification I had (other than
> family)? I can also email sig-ME if it's a better forum for this.
>
> On Thu, Jan 11, 2018 at 8:33 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
>
>> On Thu, 11 Jan 2018, Sima Usvyatsov wrote:
>>
>> What other GLMM options do I have, under the restrictions of count data
>>> with lots of zeroes and spatial autocorrelation? I was under the
>>> impression
>>> that glmmPQL was my only choice?
>>>
>>
>> See for example:
>>
>> https://www.fromthebottomoftheheap.net/2017/05/04/compare-
>> mgcv-with-glmmTMB/
>>
>> https://rjournal.github.io/archive/2017/RJ-2017-066/index.html
>> (forthcoming)
>>
>> for surveys and discussions; maybe your current choice is the best bet for
>> negbin.
>>
>>
>> Roger
>>
>>
>>> On Thu, Jan 11, 2018 at 8:18 AM, Roger Bivand <Roger.Bivand at nhh.no>
>>> wrote:
>>>
>>> On Thu, 11 Jan 2018, Sima Usvyatsov wrote:
>>>>
>>>> Thank you so much for your response.
>>>>
>>>>>
>>>>> Yes, I managed to muck up the fake data in 2 (!) ways - the 1,000 lons
>>>>> and
>>>>> the fact that the lon/lats weren't repeated. Here's the correct
>>>>> structure.
>>>>>
>>>>> df <- data.frame(Loc = as.factor(rep(1:20, each = 5)), Lat =
>>>>> rep(rnorm(20,
>>>>> 30, 0.1), each = 5), Lon = rep(rnorm(20, -75, 1), each = 5), x =
>>>>> rnegbin(100, 1, 1),
>>>>> Stratum = rep(1:5, each = 20))
>>>>>
>>>>> In the meantime, I (somewhat) resolved the issue with the glmmPQL fixed
>>>>> effects - my fault, of course.
>>>>>
>>>>> My current model is set up as follows:
>>>>>
>>>>> mod1 <- glmmPQL(Count ~ Stratum + SiteInStratum + ...other predictors,
>>>>> random = ~ 1 |RoundStart,
>>>>> family = quasipoisson,
>>>>> correlation = corExp(form=~site.Easting + site.Northing + RoundStart)
>>>>>
>>>>> where RoundStart is the date/time of starting each count. I'm assuming
>>>>> that
>>>>> by "using a standard GLMM" you were thinking of MASS's glmmPQL()?
>>>>> Does this look correct for specifying the full space/time dependence?
>>>>> The
>>>>> variogram and pacf look very decent, but one can never have too many
>>>>> checks...
>>>>>
>>>>>
>>>> I can't judge whether it really makes sense, but I think this is much
>>>> more
>>>> robust. I'd explore some other GLMMs too, to see what they contribute.
>>>>
>>>> Roger
>>>>
>>>>
>>>>
>>>> On Thu, Jan 11, 2018 at 5:01 AM, Roger Bivand <Roger.Bivand at nhh.no>
>>>>> wrote:
>>>>>
>>>>> On Wed, 10 Jan 2018, Sima Usvyatsov wrote:
>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>>
>>>>>>> I am running a negative binomial model (MASS) on count data collected
>>>>>>> on a
>>>>>>> grid. The dataset is large - ~4,000 points, with many predictors.
>>>>>>> Being
>>>>>>> counts, there are a lot of zeroes. All data are collected on a grid
>>>>>>> with
>>>>>>> 20
>>>>>>> points, with high spatial autocorrelation.
>>>>>>>
>>>>>>> I would like to filter out the spatial autocorrelation. My question
>>>>>>> is:
>>>>>>> since I have very limited spatial info (only 20 distinct spatial
>>>>>>> locations), is it possible to simplify ME() so that I don't have to
>>>>>>> run
>>>>>>> it
>>>>>>> on the whole dataset? When I try to run ME() on a 100-point subset of
>>>>>>> the
>>>>>>> data, I get error in glm.fit: NA/NaN/Inf in 'x'. When I run it on a
>>>>>>> single
>>>>>>> instance of the grid, I "get away" with a warning ("algorithm did not
>>>>>>> converge").
>>>>>>>
>>>>>>> Here's a fake dataset. It was grinding for a while but not throwing
>>>>>>> errors
>>>>>>> (like my original data would). Regardless, it demonstrates the
>>>>>>> repeated
>>>>>>> sampling at the same points and the large number of zeroes.
>>>>>>>
>>>>>>>
>>>>>>> The data set has 1000 values in Lon, so is probably bigger than you
>>>>>> intended, and when 100 is used is not autocorrelated. You seem to have
>>>>>> a
>>>>>> hierarchical model, with repeated measurements at the locations, so a
>>>>>> multi-level treatment of some kind may be sensible. If you want to stay
>>>>>> with ME-based spatial filtering, maybe look at the literature on
>>>>>> spatial
>>>>>> panel (repeated measurements are in time) with ME/SF, and on network
>>>>>> autocorrelation (dyadic relationships with autocorrelation among
>>>>>> origins
>>>>>> and/or destinations). Both these cases use Kronecker products on the
>>>>>> selected eigenvectors, I think.
>>>>>>
>>>>>> Alternatively, use a standard GLMM with a grouped iid random effect
>>>>>> and/or
>>>>>> a spatially structured random effect at the 20 location level. If the
>>>>>> groups are repeated observations in time, you should model the whole
>>>>>> (non-)separable space-time process.
>>>>>>
>>>>>> Hope this helps,
>>>>>>
>>>>>> Roger
>>>>>>
>>>>>>
>>>>>> Any advice would be most welcome.
>>>>>>
>>>>>>>
>>>>>>> library(spdep)
>>>>>>> library(MASS)
>>>>>>>
>>>>>>> df <- data.frame(Loc = as.factor(rep(1:20, each = 5)), Lat =
>>>>>>> rnorm(100,
>>>>>>> 30,
>>>>>>> 0.1), Lon = rnorm(1000, -75, 1), x = rnegbin(100, 1, 1))
>>>>>>> coordinates(df) <- ~Lon + Lat
>>>>>>> proj4string(df) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84
>>>>>>> +no_defs")
>>>>>>> nb <- dnearneigh(x=coordinates(df), d1=0, d2=200,longlat = TRUE)
>>>>>>> dists <- nbdists(nb, coordinates(df), longlat=TRUE)
>>>>>>> glist <- lapply(dists, function(x) 1/x)
>>>>>>> lw <- nb2listw(nb, glist,
>>>>>>> <https://maps.google.com/?q=tw(nb,+glist,&entry=gmail&source=g>
>>>>>>> style="W")
>>>>>>> me <- ME(x ~ 1, data = df, family = "quasipoisson", listw = lw, alpha
>>>>>>> =
>>>>>>> 0.5)
>>>>>>>
>>>>>>>         [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-Geo mailing list
>>>>>>> R-sig-Geo at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>> Roger Bivand
>>>>>> Department of Economics, Norwegian School of Economics,
>>>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>>>> voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
>>>>>> Editor-in-Chief of The R Journal, https://journal.r-project.org/
>>>>>> index.html
>>>>>> http://orcid.org/0000-0003-2392-6140
>>>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Roger Bivand
>>>> Department of Economics, Norwegian School of Economics,
>>>> Helleveien 30, N-5045 Bergen, Norway.
>>>> voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
>>>> Editor-in-Chief of The R Journal, https://journal.r-project.org/
>>>> index.html
>>>> http://orcid.org/0000-0003-2392-6140
>>>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>>>
>>>>
>>>
>> --
>> Roger Bivand
>> Department of Economics, Norwegian School of Economics,
>> Helleveien 30, N-5045 Bergen, Norway.
>> voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
>> Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
>> http://orcid.org/0000-0003-2392-6140
>> https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en
>>
>

-- 
Roger Bivand
Department of Economics, Norwegian School of Economics,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; e-mail: Roger.Bivand at nhh.no
Editor-in-Chief of The R Journal, https://journal.r-project.org/index.html
http://orcid.org/0000-0003-2392-6140
https://scholar.google.no/citations?user=AWeghB0AAAAJ&hl=en



More information about the R-sig-Geo mailing list