[R] lmer and a response that is a proportion
Cameron Gillies
cgillies at ualberta.ca
Mon Dec 4 00:30:46 CET 2006
Dear Brian and John,
Thanks for your insight. I'll clarify a couple of things in case it changes
your advice.
My response is a ratio of two measures taken along a bird's path, which
varies from 0 to 1, so I cannot convert it to columns of the numbers of
successes and failures. It has to be reported as a proportion. I could
logit transform it to make it approximately normal, but I am trying to
avoid that so I can analyze it directly.
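
For reference, a minimal sketch of the logit transform mentioned above, in
base R. The vector p holds made-up ratios, not my data. One thing it shows
is that a ratio of exactly 0 or 1 does not map to a finite value.

```r
# Hypothetical path ratios in [0, 1] (illustrative values only).
p <- c(0.02, 0.25, 0.5, 0.9, 1)

# Logit transform: qlogis(p) is log(p / (1 - p)).
z <- qlogis(p)

# Ratios of exactly 0 or 1 map to -Inf or Inf, so they would need
# separate handling before fitting a model to z.
finite <- is.finite(z)

# plogis() is the inverse transform and recovers the original ratios.
p_back <- plogis(z)
```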
The subjects are individual birds and I have a range of sample sizes from
each bird (from 8 to >200, average of about 75 measurements/bird).
Thanks!
Cam
On 12/3/06 3:47 PM, "Prof Brian Ripley" <ripley at stats.ox.ac.uk> wrote:
> On Sun, 3 Dec 2006, John Fox wrote:
>
>> Dear Cameron,
>>
>>> -----Original Message-----
>>> From: r-help-bounces at stat.math.ethz.ch
>>> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Cameron Gillies
>>> Sent: Sunday, December 03, 2006 1:58 PM
>>> To: r-help at stat.math.ethz.ch
>>> Subject: [R] lmer and a response that is a proportion
>>>
>>> Greetings all,
>>>
>>> I am using lmer (lme4 package) to analyze data where the
>>> response is a proportion (0 to 1). It appears to work, but I
>>> am wondering if the analysis is treating the response
>>> appropriately - i.e. can lmer do this?
>>>
>>
>> As far as I know, you can specify the response as a proportion, in which
>> case the binomial counts would be given via the weights argument -- at least
>> that's how it's done in glm(). An alternative that should be equivalent is
>> to specify a two-column matrix with counts of "successes" and "failures" as
>> the response. Simply giving the proportion of successes without the counts
>> wouldn't be appropriate.
>>
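
A minimal sketch of the two response forms described above, using glm()
from base R on simulated counts. The data frame d and its columns
(x, n, y, p) are invented for illustration and are not from this thread.

```r
# Simulated data (hypothetical): 20 observations, each with n trials
# and y successes.
set.seed(1)
n <- sample(8:200, 20, replace = TRUE)   # number of trials per observation
x <- rnorm(20)                           # a single covariate
y <- rbinom(20, size = n, prob = plogis(-0.5 + 0.8 * x))
d <- data.frame(x = x, n = n, y = y, p = y / n)

# Form 1: proportion as the response, binomial counts via 'weights'.
fit_w <- glm(p ~ x, family = binomial, weights = n, data = d)

# Form 2: two-column matrix of successes and failures as the response.
fit_m <- glm(cbind(y, n - y) ~ x, family = binomial, data = d)

# The two specifications should give the same coefficient estimates.
same_coef <- all.equal(coef(fit_w), coef(fit_m))
```

Both forms need the counts n. A proportion supplied without them carries
no information about the number of trials behind it.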
>>> I have used both family=binomial and quasibinomial - is one
>>> more appropriate when the response is a proportion? The
>>> coefficient estimates are identical, but the standard errors
>>> are larger with family=binomial.
>>>
>>
>> The difference is that in the binomial family the dispersion is fixed to 1,
>> while in the quasibinomial family it is estimated as a free parameter. If
>> the standard errors are larger with family=binomial, then that suggests that
>> the data are underdispersed (relative to the binomial); if the difference is
>> substantial -- the factor is just the square root of the estimated
>> dispersion -- then the binomial model is probably not appropriate for the
>> data.
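
A sketch of that relationship for a plain GLM (not a mixed model), using
glm() on simulated data. All object names are hypothetical. The
quasibinomial fit keeps the binomial coefficients and rescales the standard
errors by the square root of the estimated dispersion.

```r
# Simulated counts (hypothetical) with extra variation added on the
# logit scale, so the estimated dispersion will typically exceed 1 here.
# The case reported in the thread (larger standard errors under
# family = binomial) would correspond to a dispersion below 1.
set.seed(2)
n <- rep(50, 40)
x <- rnorm(40)
eta <- -0.3 + 0.6 * x + rnorm(40, sd = 0.7)
y <- rbinom(40, size = n, prob = plogis(eta))

fit_b <- glm(cbind(y, n - y) ~ x, family = binomial)       # dispersion fixed at 1
fit_q <- glm(cbind(y, n - y) ~ x, family = quasibinomial)  # dispersion estimated

phi  <- summary(fit_q)$dispersion
se_b <- coef(summary(fit_b))[, "Std. Error"]
se_q <- coef(summary(fit_q))[, "Std. Error"]

# Same point estimates; standard errors differ by a factor of sqrt(phi).
same_coef <- all.equal(coef(fit_b), coef(fit_q))
same_se   <- all.equal(se_q, se_b * sqrt(phi))
```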
>
> John's last deduction is appropriate to a GLM, but not necessarily to a
> GLMM. I don't have detailed experience with lmer for binomial, but I do
> for various other fitting routines for GLMM. Remember there are at least
> two sources of randomness in a GLMM, and let us keep it simple and have
> just a subject effect and a measurement error. Then if over-dispersion is
> happening within subjects, forcing the binomial dispersion (at the
> measurement level) to 1 tends to increase the estimate of the
> subject-level variance component to compensate, and in turn increase some
> of the standard errors.
>
> (Please note the 'tends' in that para, as the details of the design do
> matter. For the cognoscenti, think about plot and sub-plot treatments in a
> split-plot design.)