[R-sig-ME] Nested and/or crossed and 2 level random factor

Thu Jul 11 19:14:05 CEST 2013

Linda Bürgi <patili_buergi at ...> writes:

> 
> Dear All,

> I have two quick questions about my study design. For 4 years, once
> every season, we destructively sampled larvae on bushes (the same
> bushes every time) and measured parasitism on these larvae. We had
> 10 bushes per location and two locations.  We are interested in
> whether parasitism changed over the years and varied with
> season. With repeated measures on bushes, and bushes nested in
> location, my model looks like this:

> model<-glmmPQL(parasitism ~ year:season + year + season, 
>    random=~1|location/bush, family=binomial)
> 
> Question 1: A reviewer of our paper suggested that seasons are nested 
> within years and that we should include this in the model. However, I 
> think seasons are crossed with years, not nested. If that's the case, 
> can I leave the model as is (as far as season and years are concerned)?

  I agree with you and disagree with the reviewer: seasons are crossed
with years, since the same seasons recur in every year and the effect of
season could be expected to be similar in each year.  If you were
using lme4 you could consider making the year-by-season interaction
a random effect (how many seasons are there?), e.g. (1|year:season).
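
  A rough sketch of what such a model could look like in lme4. The
data frame `d`, all column names, and the simulated values are made up
for illustration, and the two-column response assumes that several
larvae were scored per sample; none of this comes from the original
data.

```r
library(lme4)

## Simulated stand-in data; every name and number here is hypothetical.
set.seed(101)
d <- expand.grid(bush = factor(1:10), location = factor(c("A", "B")),
                 season = factor(c("spring", "summer", "autumn", "winter")),
                 year = factor(2009:2012))
d$bush_id <- interaction(d$location, d$bush, drop = TRUE)  # unique bush labels
d$total <- 20                                              # larvae per sample

## logit-scale deviations for each year-by-season cell and for each bush
cell <- interaction(d$year, d$season, drop = TRUE)
eta <- -1 +
  rnorm(nlevels(cell), sd = 0.5)[as.integer(cell)] +
  rnorm(nlevels(d$bush_id), sd = 0.3)[as.integer(d$bush_id)]
d$n_par <- rbinom(nrow(d), size = d$total, prob = plogis(eta))

## year and season as fixed main effects; their interaction as a random effect
m1 <- glmer(cbind(n_par, total - n_par) ~ year + season + location +
              (1 | year:season) + (1 | bush_id),
            family = binomial, data = d)
```

  With 4 years and 4 seasons the year:season term has only 16 levels,
so its variance will not be estimated very precisely.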

> Question 2: I know it is ridiculous to have location as a random
> factor since it only has two levels. Could I alternatively just
> include it as a fixed factor like this: glmmPQL(parasitism ~
> year:season + year + season + location, random=~1|bush,
> family=binomial)? Totally leaving it out is not an option because
> levels of parasitism vary significantly with location (but that is
> of no interest to us, hence not really a fixed factor, just a
> covariate?).

  Absolutely.   If bushes are uniquely labeled then what you've
suggested is fine; otherwise you would want random=~1|bush:location
to distinguish between (e.g.) bush 2 in location 1 and bush 2 in location 2.
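
  One way to check the labeling, and to build unique labels if they
are missing, is sketched below in base R. The layout (bushes numbered
1 to 10 within each of two locations) and the column names are
assumptions for illustration only.

```r
## Hypothetical layout: bushes numbered 1..10 within each of two locations.
d <- expand.grid(bush = factor(1:10), location = factor(c("loc1", "loc2")))

nlevels(d$bush)                      # 10: labels are reused across locations

## combine location and bush number into one label per physical bush
d$bush_id <- interaction(d$location, d$bush, drop = TRUE)
nlevels(d$bush_id)                   # 20: one label per physical bush
```

  With a column like bush_id, random=~1|bush_id refers to individual
bushes without needing the location in the grouping term.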

> Thank you already for any answers and suggestions!

> PS. I used glmmPQL instead of lmer because without the
> over-/underdispersion function in lmer everything was highly
> significant, whereas with glmmPQL it is not.

  Did you try an observation-level random effect?
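
  A minimal sketch of an observation-level random effect in lme4,
for illustration only: the data are simulated, and `d`, `obs`,
`bush_id`, `n_par` and `total` are hypothetical names. The idea is to
add a grouping factor with one level per row, so that extra-binomial
variation is absorbed by its variance.

```r
library(lme4)

## Simulated stand-in data; names and values are hypothetical.
set.seed(102)
d <- expand.grid(bush_id = factor(1:20),
                 season = factor(c("spring", "summer", "autumn", "winter")),
                 year = factor(2009:2012))
d$total <- 20
## bush-level deviations plus extra per-observation noise (overdispersion)
eta <- -1 +
  rnorm(nlevels(d$bush_id), sd = 0.3)[as.integer(d$bush_id)] +
  rnorm(nrow(d), sd = 0.7)
d$n_par <- rbinom(nrow(d), size = d$total, prob = plogis(eta))

d$obs <- factor(seq_len(nrow(d)))   # one level per row of the data

m_olre <- glmer(cbind(n_par, total - n_par) ~ year + season +
                  (1 | bush_id) + (1 | obs),
                family = binomial, data = d)
```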

  I'm a little concerned about the specification of parasitism:
it seems as though it's probably a proportion rather than a
binary variable (i.e. you sampled multiple individuals per
bush:location:season:year combination, and counted how many
were parasitized), in which case you should be including
the denominator (total) somewhere in your specification, either
as a two-column response cbind(n_parasitized,n_unparasitized) or by
using the proportion as the response and including a 'weights'
argument giving the total sample size for each observation.
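
  The two ways of supplying the denominator can be compared on a toy
example. The sketch below uses plain glm() with made-up counts so that
it runs in base R; the column names follow the ones used above, plus a
hypothetical `total` column.

```r
## Hypothetical counts: larvae parasitized out of the total sampled.
d <- data.frame(
  year = factor(rep(c(2009, 2010), each = 4)),
  season = factor(rep(c("spring", "summer", "autumn", "winter"), times = 2)),
  n_parasitized = c(3, 8, 12, 5, 4, 9, 10, 6),
  total = c(20, 25, 30, 18, 22, 24, 28, 20)
)
d$n_unparasitized <- d$total - d$n_parasitized

## (a) two-column response: successes and failures
fit_a <- glm(cbind(n_parasitized, n_unparasitized) ~ season,
             family = binomial, data = d)

## (b) proportion as response, totals supplied as weights
fit_b <- glm(n_parasitized / total ~ season,
             family = binomial, weights = total, data = d)

## compare the coefficient estimates from the two specifications
all.equal(coef(fit_a), coef(fit_b))
```

  Both forms describe the same binomial likelihood, so the estimates
agree; glmer in lme4 accepts the same two forms of response.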