[R-sig-ME] Opinions on model structure: fixed and random effects

Tue Jan 19 01:59:36 CET 2016

Dear mixed model experts,

I am hoping to get opinions on a model structure, and specifically on 
whether the Year, Season and Strata variables should be included as 
random effects, fixed effects, or both.

In a nutshell, I am building a species distribution model using 30+ 
years of fisheries trawl data with the main objective of using the model 
to predict fish distributions under future climate scenarios and the 
secondary objective of evaluating the relative importance of temperature 
predictor variables compared to static, landscape variables (e.g., 
depth, bottom type). The unit of observation is a trawl tow, which has 
an associated date (year, month, day), season (fall or spring) and 
stratum (a spatial identifier; each stratum is a unique region defined 
by biophysical characteristics and used for stratified random 
sampling). Within our dataset we have multiple tows from the same 
stratum within the same season and year. We will likely examine a few 
different 
frameworks (e.g., GLMM, GAMM, Boosted Regression Trees, Random Forests).

Taking the GLMM as an example, my plan is to do the following:
1) Include YEAR as a random effect. Although we are somewhat interested 
in the variability among years, we are not specifically interested in 
making year-to-year comparisons across all years. However, if we were, 
it sounds like an interesting approach would be to include year as both 
a random and a fixed effect (as a factor for the random component and 
as a continuous covariate for the fixed component), which would allow 
us to look at variability among years (random component) as well as 
trend and change over years (fixed component).
2) Include STRATA as a random effect. A stratum is, in many ways, 
similar to a plot in a traditional plot-based or split-plot sampling 
design. Including it as a random effect accounts for the fact that 
multiple samples from the same stratum are not truly independent. 
Additionally, we are not explicitly interested in comparing among 
strata. Therefore, including it as a random effect makes the most sense.
3) Include SEASON as a fixed effect. With only two levels (fall and 
spring), it does not make sense to include season as a random effect. 
Additionally, we are interested in seasonal differences. On a related 
note, what if we had a temperature variable measured at a seasonal scale 
(i.e., spring or fall mean temperature)? Would you drop season as a 
factor in the hope that the seasonal variability would be captured by 
the temperature variable?
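
For concreteness, the three points above could be written out roughly as 
follows. This is only a sketch under assumptions: it supposes a 
presence/absence response per tow with a logit link (a count response 
would need a different family), and all symbols are notation introduced 
here for illustration. y_i is the response for tow i, x_i collects the 
temperature and landscape covariates, and t[i] and s[i] are the year and 
stratum of tow i.

```latex
% Base plan: season fixed, year and stratum as crossed random intercepts
\begin{align*}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \beta_0 + \beta_1\,\mathrm{season}_i
    + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + a_{t[i]} + b_{s[i]} \\
a_t &\sim \mathcal{N}(0, \sigma_{\mathrm{year}}^2), \qquad t = 1, \dots, T \\
b_s &\sim \mathcal{N}(0, \sigma_{\mathrm{strata}}^2), \qquad s = 1, \dots, S
\end{align*}

% Variant from point 1: year as both a fixed trend and a random intercept
\begin{align*}
\operatorname{logit}(p_i) &= \beta_0 + \beta_1\,\mathrm{season}_i
    + \beta_2\,(\mathrm{year}_i - \overline{\mathrm{year}})
    + \mathbf{x}_i^{\top}\boldsymbol{\gamma} + a_{t[i]} + b_{s[i]}
\end{align*}
```

In the variant, beta_2 would carry the linear trend over years while the 
a_t absorb the remaining year-to-year variability around that trend. In 
lme4-style formula notation, and with hypothetical column names, the 
base plan would look something like 
y ~ season + temp + depth + (1 | year) + (1 | strata).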

Does this approach make sense?

Thank you in advance for your time and insight.

Sincerely,

Andrew
