[R-sig-ME] glmmTMB model specification - autocorrelation (John Wilson)

Highland Statistics Ltd highstat at highstat.com
Wed Feb 2 10:15:10 CET 2022

Dear list,

I'm working on a large spatiotemporal dataset. It's a grid of 20 fixed
cells, laid out as A-E, 1-4 (think chessboard), and animals are counted in
each cell roughly once an hour (the interval varies), during daylight, for
about one month a year. So there are lots of temporally adjacent samples,
separated by ~10 h overnight and by 11 months between years. Each hourly-ish
sampling, which goes over as many cells in the grid as possible given
weather, has a unique identifier (say, SamplingRound). There is a spatial
gradient of animals north-south (A-E) and also east-west (1-4).

I'm trying to run this using either temporal or spatial autocorrelation, to
see whether either one resolves the autocorrelation issues. Most of the
examples I've seen set up the autocorrelation within the random factor
group, and I can't figure out whether that applies here or how to do it. So
I have three questions - one per paragraph below.

I'm using glmmTMB so that I can use a negbin (potentially zero-inflated,
still sorting that out) while using an autocorrelation structure. I assume
that the random intercept is the individual cell (so, A1, A2, A3, A4, B1,
..., E4), since they're measured repeatedly. Assuming that's correct -
question #1) given the gradient across A-E and across 1-4, is it possible
to have 2 main fixed effects (one for A-E and one for 1-4) in addition to
the random effect of individual cell? I know that factors generally can't
be used for both, but in this case the fixed effects only have A-E OR 1-4
whereas the random effect has the individual cell ID.
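
To make the setup in question #1 concrete, here is a minimal R sketch on simulated data. The column names Row, Col and count, the number of sampling rounds, and the simulated values are all assumptions for illustration; only Cell and the 5 x 4 layout come from the description above. The fit is guarded so the snippet also runs without glmmTMB installed, and it is not meant as a recommendation on whether this is the right model.

```r
# Hypothetical layout: 5 rows (A-E) by 4 columns (1-4) = 20 cells.
grid <- expand.grid(Row = LETTERS[1:5], Col = 1:4)
grid$Cell <- factor(paste0(grid$Row, grid$Col))

set.seed(1)
n_rounds <- 50  # assumed number of sampling rounds
d <- grid[rep(seq_len(nrow(grid)), times = n_rounds), ]
d$Row <- factor(d$Row)
d$Col <- factor(d$Col)
d$count <- rnbinom(nrow(d), mu = 3, size = 1)  # simulated counts, no real signal

# Row and Col enter as fixed effects, Cell as a random intercept.
form1 <- count ~ Row + Col + (1 | Cell)

# Fit only if glmmTMB is available; nbinom2 is one of its negative
# binomial families.
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  fit1 <- glmmTMB::glmmTMB(form1, family = glmmTMB::nbinom2, data = d)
  print(summary(fit1))
}
```

Row and Col each take only 5 or 4 levels, while Cell takes 20, so the three terms are distinct grouping variables in the formula.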

Is it correct that temporal autocorrelation would need to be expressed
between sampling times within each sampling day, within each cell? This
results in a total of 6,000 groups (20 cells times 30 sampling days per
year, times 10 sampling years), and trying to run this gives a memory
error. Question #2: is this the correct specification and my computer just
can't handle it, or is it wrong? This is the setup of the random portion:
(1|Cell) + ou(time.within.day + 0 | YearDayF_Cell), where time.within.day
is a numFactor.
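
The group count quoted above can be checked with a small base-R sketch. It assumes a fully crossed design (every cell visited on every sampling day of every year); the Day and Year columns and the year values are hypothetical, while YearDayF_Cell and time.within.day follow the names used in the post. The formula is only constructed, not fitted.

```r
# Assumed design: 20 cells x 30 sampling days x 10 years, fully crossed.
design <- expand.grid(
  Cell = paste0(rep(LETTERS[1:5], each = 4), 1:4),
  Day  = 1:30,
  Year = 2012:2021  # hypothetical years
)

# One grouping level per year-day-cell combination.
design$YearDayF_Cell <- interaction(design$Year, design$Day, design$Cell,
                                    drop = TRUE)
n_groups <- nlevels(design$YearDayF_Cell)
print(n_groups)  # 6000

# Random part as written in the post; time.within.day would be a
# glmmTMB::numFactor of the sampling time within each day.
form2 <- count ~ (1 | Cell) + ou(time.within.day + 0 | YearDayF_Cell)
```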

For spatial autocorrelation, I _think_ that I would need to express it
across all cells within each sampling round, so question #3: is it correct
to have the random portion be (1|Cell) + exp(pos + 0 | SamplingRound),
where pos is numFactor(data$Easting, data$Northing)?
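
For reference, a minimal sketch of how the pieces in question #3 could be put together. The unit-spaced coordinates, the two sampling rounds, and the data frame name d3 are assumptions for illustration; pos, Cell and SamplingRound follow the post. The numFactor call is guarded so the snippet runs without glmmTMB installed, and the formula is only constructed, not fitted.

```r
# Hypothetical cell centres on a unit grid: letters index north-south,
# numbers index east-west.
cells <- expand.grid(Easting = 1:4, Northing = 1:5)
cells$Cell <- factor(paste0(LETTERS[cells$Northing], cells$Easting))

# Two hypothetical sampling rounds, each visiting every cell
# (merge with no shared columns gives the cross product).
d3 <- merge(cells, data.frame(SamplingRound = factor(1:2)))

# numFactor encodes the coordinates as a factor that the exp()
# covariance structure can read distances from.
if (requireNamespace("glmmTMB", quietly = TRUE)) {
  d3$pos <- glmmTMB::numFactor(d3$Easting, d3$Northing)
}

# Random part as written in the post: spatial correlation among cells
# within each sampling round, plus a random intercept per cell.
form3 <- count ~ (1 | Cell) + exp(pos + 0 | SamplingRound)
```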

Any advice would be very much appreciated!

John, you may want to have a look at R-INLA. It is relatively easy to
implement a spatial-temporal GLM in INLA, and it offers a range of
distributions for count data. You may need some smoothers for time as
well. It can do all that.


Dr. Alain F. Zuur
Highland Statistics Ltd.
9 St Clair Wynd
AB41 6DZ Newburgh, UK
Email: highstat at highstat.com
URL:   www.highstat.com
