[R-sig-ME] glmmTMB model specification - autocorrelation

Tue Feb 1 14:13:58 CET 2022

Dear list,

I'm working on a large spatiotemporal dataset. It's a grid of 20 fixed
cells, laid out as A-E, 1-4 (think chessboard), and animals are counted in
each cell roughly once an hour (the interval varies), during daylight, for
about one month a year. So there are lots of temporally adjacent samples,
separated by ~10 h overnight and by 11 months between years. Each
hourly-ish sampling, which covers as many cells in the grid as possible
given weather, has a unique identifier (say, SamplingRound). There is a
spatial gradient of animals north-south (A-E) and also east-west (1-4).

I'm trying to run this using either temporal or spatial autocorrelation, to
see whether either one resolves the autocorrelation issues. Most of the
examples I've seen set up the autocorrelation within the random factor
group, and I can't figure out whether that applies here or how to do it. So
I have three questions - one per paragraph below.

I'm using glmmTMB so that I can use a negbin (potentially zero-inflated,
still sorting that out) while using an autocorrelation structure. I assume
that the random intercept is the individual cell (so, A1, A2, A3, A4, B1,
..., E4), since they're measured repeatedly. Assuming that's correct -
question #1) given the gradient across A-E and across 1-4, is it possible
to have 2 main fixed effects (one for A-E and one for 1-4) in addition to
the random effect of individual cell? I know that factors generally can't
be used for both, but in this case the fixed effects only have A-E OR 1-4
whereas the random effect has the individual cell ID.
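
To make question #1 concrete, here is a minimal sketch of the model being
described, fitted to simulated toy data. The column names Row, Col and
Count, and all simulated values, are assumptions for illustration only;
they are not taken from the real dataset, and the sketch does not settle
whether this specification is appropriate.

```r
library(glmmTMB)

set.seed(1)
# Hypothetical toy data: 5 rows (A-E) x 4 columns (1-4), 30 counts per cell
d <- expand.grid(Row = LETTERS[1:5], Col = as.character(1:4), rep = 1:30)
d$Cell <- factor(paste0(d$Row, d$Col))   # 20 cell IDs: A1, ..., E4

# Simulated negative binomial counts with a small cell-level effect
cell_eff <- rnorm(nlevels(d$Cell), sd = 0.3)
d$Count <- rnbinom(nrow(d),
                   mu = exp(1.5 + cell_eff[as.integer(d$Cell)]),
                   size = 2)

# Row and Col as fixed effects, individual cell as random intercept
m1 <- glmmTMB(Count ~ Row + Col + (1 | Cell),
              family = nbinom2, data = d)
```

Here Row and Col each take a single value within any one Cell, so the
fixed effects describe the two gradients and the random intercept picks up
whatever cell-to-cell variation is left over.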

Is it correct that temporal autocorrelation would need to be expressed
between sampling times within each sampling day, within each cell? This
results in a total of 6,000 groups (20 cells times 30 sampling days per
year, times 10 sampling years), and trying to run this gives a memory
error. Question #2: is this the correct specification and my computer just
can't handle it, or is it wrong? This is the setup of the random portion:
(1|Cell) + ou(time.within.day + 0 | YearDayF_Cell), where time.within.day
is a numFactor.
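
For reference, a scaled-down sketch of this random-effects setup on
simulated data (4 cells and 3 days rather than the full 6,000 groups). The
columns hour, Day and Count and all simulated values are hypothetical; only
the ou() term mirrors the specification above. The latent-size line
reflects my understanding of how glmmTMB lays out such terms and should be
treated as an assumption.

```r
library(glmmTMB)

set.seed(2)
# Hypothetical toy data: 4 cells x 3 days x 8 roughly 1.5-hourly samples
d <- expand.grid(Cell = c("A1", "A2", "B1", "B2"), Day = 1:3, k = 1:8)
d$hour <- 6 + 1.5 * d$k + sample(c(-0.5, 0, 0.5), nrow(d), replace = TRUE)

d$time.within.day <- numFactor(d$hour)                 # time coordinate
d$YearDayF_Cell <- interaction(d$Day, d$Cell, drop = TRUE)  # 12 groups
d$Count <- rnbinom(nrow(d), mu = 5, size = 2)

# Assumption: the ou() term has one latent value per
# (time level, group) combination, so this product grows quickly
n_latent <- nlevels(d$time.within.day) * nlevels(d$YearDayF_Cell)

m2 <- glmmTMB(Count ~ 1 + (1 | Cell) +
                ou(time.within.day + 0 | YearDayF_Cell),
              family = nbinom2, data = d)
```

Printing n_latent for the real data might help show whether the memory
error comes from the number of distinct time values, the number of groups,
or both.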

For spatial autocorrelation, I _think_ that I would need to express it
across all cells within each sampling round, so question #3: is it correct
to have the random portion be (1|Cell) + exp(pos + 0 | SamplingRound),
where pos is numFactor(data$Easting, data$Northing)?
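
A small sketch of that spatial specification on simulated data, to show
how the pieces fit together. The grid coordinates (unit spacing), the
mapping of Northing to the letters A-E, the Count column and the 15
sampling rounds are all assumptions for illustration, not the real data.

```r
library(glmmTMB)

set.seed(3)
# Hypothetical toy data: 4 x 5 grid, every cell visited in 15 rounds
d <- expand.grid(Easting = 1:4, Northing = 1:5, SamplingRound = 1:15)
d$Cell <- factor(paste0(LETTERS[d$Northing], d$Easting))
d$SamplingRound <- factor(d$SamplingRound)

# Two-dimensional coordinate factor used by the exp() structure
d$pos <- numFactor(d$Easting, d$Northing)
d$Count <- rnbinom(nrow(d), mu = 5, size = 2)

# Exponential spatial correlation among cells within each sampling round
m3 <- glmmTMB(Count ~ 1 + (1 | Cell) +
                exp(pos + 0 | SamplingRound),
              family = nbinom2, data = d)
```

With only 20 distinct positions, the pos factor has 20 levels no matter how
many sampling rounds there are.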

Any advice would be very much appreciated!
John
