[R-sig-ME] Modelling proportion data in lme4

Thierry Onkelinx thierry.onkelinx at inbo.be
Thu Mar 30 16:36:58 CEST 2017

Dear Adriana,

Use binomial only when the raw proportions stem from n Bernoulli
trials, e.g. 25% of 20 trials (or 5/20). In your case n could be
the total abundance of all species at each site. Use that as the weights.
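
A minimal sketch of what "use n as the weights" means in R, on simulated
data. All variable names (block, x, n_total, n_shared, prop) are made up
for illustration and are not taken from your data. With a binomial family,
a proportion response with the number of trials as weights is the same
model as a two-column cbind(successes, failures) response; the check below
uses base glm, and the glmer call only runs if lme4 is installed.

```r
# Simulated data; every name here is hypothetical.
set.seed(1)
n_obs   <- 60
block   <- factor(rep(1:10, each = 6))
b       <- rnorm(10, sd = 0.5)                 # block-level random intercepts
d <- data.frame(
  block   = block,
  x       = rnorm(n_obs),
  n_total = sample(20:200, n_obs, replace = TRUE)  # denominator (number of trials)
)
d$n_shared <- rbinom(n_obs, size = d$n_total,
                     prob = plogis(-0.5 + 0.8 * d$x + b[d$block]))  # numerator
d$prop <- d$n_shared / d$n_total               # proportion in [0, 1]

# Form 1: proportion as response, number of trials as weights.
fit_w <- glm(prop ~ x, family = binomial, weights = n_total, data = d)

# Form 2: two-column response of successes and failures.
fit_c <- glm(cbind(n_shared, n_total - n_shared) ~ x,
             family = binomial, data = d)

# Both forms describe the same likelihood, so the estimates should agree.
same_fit <- isTRUE(all.equal(coef(fit_w), coef(fit_c), tolerance = 1e-6))
print(same_fit)

# The same weights idea in a mixed model, if lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_m <- lme4::glmer(prop ~ x + (1 | block), family = binomial,
                       weights = n_total, data = d)
  print(lme4::fixef(fit_m))
}
```

If prop * n_total is not (close to) an integer, the response does not come
from n Bernoulli trials and the non-integer warning is telling you so.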

Best regards,
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht

To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey

2017-03-30 11:41 GMT+02:00 Adriana De Palma <A.De-Palma at nhm.ac.uk>:
> Dear all,
> I'd be really grateful if someone could advise on the following issue I've come across.
> I have proportion data (non-integer, bounded between 0 and 1) as my response variable, in a model that requires nested random effects and weights, which makes lme4 the ideal choice. Using lme4 with a binomial error structure and logit link seems to produce reasonable (and realistic looking) results, and the residual plots look good. However, it warns me that the error structure expects integer data, and I don't know whether this approach is doing what I think (and hope) that it is doing. I have tried to validate the lme4 results in the following ways:
> 1. Running the same method (binomial error structure and logit link with the proportions as the response variable) with glmmADMB. This produces very different results (they are completely unrealistic, e.g. predicted proportion of 2.16e-34).
> 2. Using beta regression with glmmADMB. This seems to work and produce results that are on the same scale, but not that close to those of lme4.
> 3. Running an lme4 model with normal errors (lmer), after logit-transforming the response variable. This again gives quite different results to the lme4 model with binomial error structure and logit link (and the behaviour of the residuals is not ideal).
> Since these all give different results, it's hard to tell whether the lme4 method I've used is giving the 'right' answer. I would be really grateful for any advice. Is lme4 correctly analysing the proportion data when a binomial error structure and logit link are specified?
> Additional note: the proportion data are compositional similarity measurements (Jaccard asymmetric abundance-based compositional similarity), so technically there is a numerator and denominator (numerator = abundance of species in Site 1 that are also present in Site 2; denominator = abundance of all species in Site 1). I've been exploring different weights options, but they generally include the denominator.
> Many thanks in advance,
> Adriana
> _____
> Adriana De Palma
> PREDICTS Postdoctoral Research Assistant
> Natural History Museum
> South Kensington
> Web: The Purvis Lab<http://www.bio.ic.ac.uk/research/apurvis/ajpurvis.htm> | PREDICTS<predicts.org.uk>