[R-sig-ME] Mixed Model Specification
Ben Bolker
bbolker at gmail.com
Thu Jun 26 23:14:37 CEST 2014
On 14-06-26 05:00 PM, Worthington, Thomas A wrote:
> Dear All
>
> I have a question about the use of a mixed effects model. I have
> presence/absence data for a mussel species collected at 25 sites. I
> wish to relate the presence/absence to a number of environmental
> variables and also want to take into account site. Is it feasible to
> use site as a random effect as I have only one replicate per site
> e.g.
>
> M1 <- glmer(Presence ~ Substrate + (1 | Site), family = binomial,
>             data = data)
>
No -- the among-site variance is not identifiable with binary
(Bernoulli/presence-absence) responses (unless the responses can be
grouped into sets of more than one observation with the same
covariates), see e.g. http://www.gllamm.org/JEBSredundant_07.pdf section
2.2. You can probably find further discussion elsewhere on this list.
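
A minimal base-R sketch of why the variance is not identifiable (the intercept and standard deviation values are arbitrary assumptions chosen for illustration, not estimates from the mussel data): with a single Bernoulli observation per site, the likelihood depends only on the marginal probability of presence, so a model with no among-site variance and one with a large among-site variance can assign the same probability to every observation.

```r
# Marginal P(y = 1) for one Bernoulli observation whose logit is
# beta0 + b, with b ~ Normal(0, sigma^2), obtained by integrating over b.
marginal_p <- function(beta0, sigma) {
  integrate(function(z) plogis(beta0 + sigma * z) * dnorm(z),
            lower = -Inf, upper = Inf)$value
}

# Model A (arbitrary illustrative values): intercept 0.5, no site variance
p_A <- marginal_p(beta0 = 0.5, sigma = 0)

# Model B: among-site SD of 2; solve for the intercept that gives the
# same marginal probability as model A
beta0_B <- uniroot(function(b0) marginal_p(b0, sigma = 2) - p_A,
                   interval = c(0, 5), tol = 1e-10)$root
p_B <- marginal_p(beta0 = beta0_B, sigma = 2)

# Two different (intercept, SD) pairs, one and the same marginal probability
c(p_A = p_A, p_B = p_B, beta0_B = beta0_B)
```

Because both parameter pairs imply the same probability for each single 0/1 response, the data cannot distinguish between them.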

In theory, if you have a single categorical predictor (Substrate), you
could group your data into the form

  Substrate, num.present, num.absent

which gives a binomial (non-binary) response, but if you want to
estimate a fixed effect of substrate you still can't use an
observation-level random effect to model overdispersion, because

  cbind(num.present,num.absent) ~ Substrate + (1|obs)

will still confound observation and substrate (each grouped row is the
only observation for its substrate level).
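
The grouping step can be sketched in base R as follows. The data frame `dat`, its substrate labels, and its presence values are made up for illustration; only the column names Substrate, num.present and num.absent come from the discussion above.

```r
# Hypothetical data: 25 sites, one presence/absence record each.
# Substrate labels and Presence values are invented for illustration.
dat <- data.frame(
  Site      = factor(1:25),
  Substrate = factor(rep(c("gravel", "sand", "silt"), times = c(9, 8, 8))),
  Presence  = c(1,1,1,0,1,0,1,1,0,  0,1,0,0,1,0,0,1,  1,0,0,0,0,1,0,0)
)

# Collapse to one row per substrate: Substrate, num.present, num.absent
num.present <- tapply(dat$Presence, dat$Substrate, sum)
num.absent  <- tapply(1 - dat$Presence, dat$Substrate, sum)
grouped <- data.frame(
  Substrate   = factor(names(num.present)),
  num.present = as.vector(num.present),
  num.absent  = as.vector(num.absent)
)
grouped

# Fixed effect of Substrate only (no random effect)
fit <- glm(cbind(num.present, num.absent) ~ Substrate,
           family = binomial, data = grouped)

# One row per substrate level and one parameter per level: the model is
# saturated, so no residual degrees of freedom remain from which an
# observation-level variance could be estimated.
c(rows = nrow(grouped), parameters = length(coef(fit)),
  residual.df = df.residual(fit))
```

With zero residual degrees of freedom, the fixed substrate effects already reproduce each grouped row, which is another way of seeing the confounding.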

Even if you could do this, 25 presence-absence samples is a very small
data set (sorry). The 'effective sample size' for binary variables is
min(# presence, # absence) (see e.g. Harrell _Regression Modeling
Strategies_), and the rule of thumb is that you should have 10-20
observations per parameter, so even in the best case (50%
presence/absence) you probably can't do very much more than estimate the
difference in probability of presence between 2 or 3 different kinds of
substrates.
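
As a rough check of that arithmetic, here is a small sketch that takes the limiting sample size to be the smaller of the two outcome counts and assumes the most balanced split that 25 sites allow (12 versus 13); both the split and the variable names are illustrative assumptions.

```r
# Rule-of-thumb arithmetic; all numbers are illustrative.
n_sites   <- 25
n_present <- 12                    # assumed best case: as close to 50/50 as 25 allows
n_absent  <- n_sites - n_present   # 13

# Limiting ("effective") sample size for a binary response
n_eff <- min(n_present, n_absent)

# Number of parameters supported at 20 and at 10 observations per parameter
params <- n_eff / c(per20 = 20, per10 = 10)
params
```

Under these assumptions the budget is on the order of a single parameter, which is why only a very simple comparison among substrates is realistic.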
Ben Bolker