[R-sig-ME] Mixed Model Specification

Thu Jun 26 23:14:37 CEST 2014

On 14-06-26 05:00 PM, Worthington, Thomas A wrote:
> Dear All
> 
> I have a question about the use of a mixed effects model. I have
> presence/absence data for a mussel species collected at 25 sites. I
> wish to relate the presence/absence to a number of environmental
> variables and also want to take into account site. Is it feasible to
> use site as a random effect as I have only one replicate per site
> e.g.
> 
> M1<-glmer(Presence ~ Substrate + (1 | Site), family = binomial, data =
> data)
> 

  No  -- the among-site variance is not identifiable with binary
(Bernoulli/presence-absence) responses (unless the responses can be
grouped into sets of more than one observation with the same
covariates), see e.g. http://www.gllamm.org/JEBSredundant_07.pdf section
2.2.  You can probably find further discussion elsewhere on this list.
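
  A minimal numerical sketch of why this happens, assuming an
intercept-only logistic model with a normal random intercept per site
and one binary observation per site (marginal_p is a hypothetical
helper, not an lme4 function). All the data can tell you is the
marginal probability of presence, and different combinations of
intercept and among-site SD give the same marginal probability:

```r
# Marginal P(presence) when logit(p) = beta + b, with b ~ Normal(0, sigma^2).
# marginal_p is a hypothetical helper written for this illustration.
marginal_p <- function(beta, sigma) {
  if (sigma == 0) return(plogis(beta))
  integrate(function(b) plogis(beta + b) * dnorm(b, sd = sigma),
            lower = -Inf, upper = Inf)$value
}

# Model A: intercept 1, no among-site variance.
p_a <- marginal_p(beta = 1, sigma = 0)

# Model B: among-site SD of 2, intercept chosen to match model A's marginal p.
beta_b <- uniroot(function(beta) marginal_p(beta, sigma = 2) - p_a,
                  interval = c(0, 10))$root
p_b <- marginal_p(beta = beta_b, sigma = 2)

# Near-identical marginal probabilities from very different (beta, sigma) pairs.
c(model_a = p_a, model_b = p_b, beta_b = beta_b)
```

With a single 0/1 value per site, the two models imply the same
distribution for the data, so the likelihood cannot separate them.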

  In theory if you have a single categorical predictor (Substrate) you
could group your data into the form

  Substrate, num.present, num.absent

which gives a binomial (non-binary) response, but if you want to
estimate a fixed effect of substrate you still can't use an
observation-level random effect to model overdispersion, because the
grouped data have only one row per substrate level, so

  cbind(num.present,num.absent) ~ Substrate + (1|obs)

will still confound observation and substrate.
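
  To make the grouping concrete, here is a rough sketch in base R. The
data frame dat and its values are made up for illustration (three
substrate types, 25 sites); only the column names follow the question.
After grouping there is one row per substrate, so the fixed-effects fit
is already saturated and nothing is left over for an observation-level
term:

```r
# Made-up data: 25 sites, one binary observation each.
dat <- data.frame(
  Site      = factor(1:25),
  Substrate = factor(rep(c("gravel", "sand", "silt"), length.out = 25)),
  Presence  = rep(c(1, 0, 0, 1, 1), times = 5)
)

# Collapse to one row per substrate: counts of presences and absences.
num.present <- tapply(dat$Presence, dat$Substrate, sum)
num.absent  <- tapply(1 - dat$Presence, dat$Substrate, sum)
grouped <- data.frame(
  Substrate   = factor(names(num.present)),
  num.present = as.vector(num.present),
  num.absent  = as.vector(num.absent)
)
grouped$obs <- factor(seq_len(nrow(grouped)))  # observation-level index

# obs and Substrate label exactly the same rows.
same_grouping <- nrow(grouped) == nlevels(grouped$Substrate)

# Fixed-effects-only binomial GLM on the grouped data.
fit <- glm(cbind(num.present, num.absent) ~ Substrate,
           family = binomial, data = grouped)
resid_df <- df.residual(fit)  # no residual degrees of freedom remain
```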

  Even if you could do this, 25 presence-absence samples is a very small
data set (sorry). The 'effective sample size' for binary variables is
min(# presence, # absence) (see e.g. Harrell _Regression Modeling
Strategies_), and the rule of thumb is that you should have 10-20
observations per parameter, so even in the best case (50%
presence/absence) you probably can't do very much more than estimate the
difference in probability of presence between 2 or 3 different kinds of
substrates.
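
  The arithmetic, as a back-of-the-envelope sketch: take the effective
sample size as the smaller of the two counts and divide by the
rule-of-thumb number of observations per parameter. The 13/12 split
below is a hypothetical best case for 25 sites, not the real data:

```r
n.sites   <- 25
n.present <- 13                      # hypothetical near-even split
n.absent  <- n.sites - n.present

# Effective sample size: the smaller of the two outcome counts.
ess <- min(n.present, n.absent)

# Rule of thumb: 10-20 observations per estimated parameter.
obs.per.param <- c(conservative = 20, liberal = 10)
param.budget  <- ess / obs.per.param
param.budget
```

Here ess is 12, so the budget is somewhere between about 0.6 and 1.2
parameters, which is why there is so little room for anything beyond a
very simple comparison.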

 Ben Bolker