[R-sig-ME] Opinions on model structure: fixed and random effects
Andrew Allyn
andrew.allyn at gmail.com
Tue Jan 19 01:59:36 CET 2016
Dear mixed model experts,
I am hoping to get opinions on a model structure, and specifically on
whether Year, Season and Strata variables should be included as random
effects, fixed effects, or both.
In a nutshell, I am building a species distribution model using 30+
years of fisheries trawl data with the main objective of using the model
to predict fish distributions under future climate scenarios and the
secondary objective of evaluating the relative importance of temperature
predictor variables compared to static, landscape variables (e.g.,
depth, bottom type). The unit of observation is a trawl tow, which has
an associated date (year, month, day), season (fall or spring) and
stratum (a spatial identifier, where each stratum is a unique region
defined by biophysical characteristics and used for stratified random
sampling). Within our dataset we have multiple tows from the same
stratum within the same season and year. We will likely examine a few
different frameworks (e.g., GLMM, GAMM, Boosted Regression Trees,
Random Forests).
Taking the GLMM as an example, my plan is to do the following:
1) Include YEAR as a random effect. Although we are somewhat interested
in the variability among years, we are not specifically interested in
making year-to-year comparisons among all years. If we were, an
interesting approach seems to be to include year as both a random and a
fixed effect, which would allow us to look at variability among years
(random component) as well as trend and change over years (fixed
component).
2) Include STRATA as a random effect. A stratum is, in many ways,
similar to a plot in a traditional plot-based or split-plot sampling
design. Including it as a random effect accounts for the fact that
multiple samples from the same stratum are not truly independent.
Additionally, we are not explicitly interested in comparing among
strata, so including it as a random effect makes the most sense.
3) Include SEASON as a fixed effect. With only two levels (fall and
spring), a variance among seasons cannot be estimated reliably, so it
does not make sense to include season as a random effect. Additionally,
we are interested in seasonal differences. On a related note, what if we
had a temperature variable measured at a seasonal scale (i.e., spring or
fall mean temp)? Would you drop season as a factor in the hope that the
seasonal variability is captured by the temperature variable?
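As a rough sketch only, the structure in points 1-3 could be written out
as below. This assumes a presence/absence response with a logit link,
which is not settled above; the symbols (y_i, p_i, x_i, a, b and the
betas) are illustrative names rather than anything fixed in our data.

```latex
\begin{align*}
  y_i &\sim \mathrm{Bernoulli}(p_i) \\
  \operatorname{logit}(p_i) &= \beta_0
      + \beta_1\,\mathrm{Season}_i
      + \mathbf{x}_i^{\top}\boldsymbol{\beta}
      + a_{\mathrm{year}[i]}
      + b_{\mathrm{strata}[i]} \\
  a_j &\sim \mathcal{N}(0, \sigma^2_{\mathrm{year}}), \qquad
  b_k \sim \mathcal{N}(0, \sigma^2_{\mathrm{strata}})
\end{align*}
```

Here i indexes tows, x_i holds the temperature and static covariates
(e.g., depth, bottom type), a_j is the random intercept for year j and
b_k the random intercept for stratum k. The variant mentioned in point 1
would add a fixed trend term in Year_i alongside a_{year[i]}.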
Does this approach make sense?
Thank you in advance for your time and insight.
Sincerely,
Andrew