[R-sig-ME] lmer vs glmmPQL
Ken Beath
ken at kjbeath.com.au
Wed Jul 1 10:37:41 CEST 2009
On 01/07/2009, at 11:03 AM, Ben Bolker wrote:
> Fabian Scheipl wrote:
>> On Tue, Jun 30, 2009 at 9:16 AM, Ken Beath<ken at kjbeath.com.au> wrote:
>>> It appears that PQL with moderate random effect variance introduces
>>> a small bias in a direction that reduces the MSE, at least in the
>>> simulations chosen. For large variances the bias is probably
>>> excessive and the MSE will increase using PQL.
>
> Hmmm. How can bias, in any direction, reduce MSE? (I can see that
> there could be a tradeoff between bias and variance, but MSE
> incorporates bias^2, right?) How about bias-corrected variants of
> PQL (a la Raudenbush et al) -- might those provide the best
> of both worlds, or does the additional complexity inevitably
> increase variance -> MSE? (I don't know if those bias-corrected
> variants are implemented anywhere other than MLwiN/HLM ... ?)
>
By bias for PQL, I mean the difference from the "correct" maximum
likelihood estimates rather than from the true values.
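On the general question of whether a biased estimator can ever have a lower MSE than an unbiased one, a toy calculation (nothing to do with PQL itself) suggests it can. This is only a sketch: the setting, the values of mu and se, and the shrinkage factor are all assumptions chosen for illustration. It uses MSE = bias^2 + variance for a sample mean and for the same mean shrunk towards zero.

```r
# Toy setting (assumed, not taken from the simulations in this thread):
# xbar ~ N(mu, se^2) is unbiased for mu; shrink * xbar is biased towards 0.
mu <- 1
se <- 1
shrink <- 0.8

# MSE decomposes into squared bias plus variance.
mse <- function(bias, variance) bias^2 + variance

mse_unbiased <- mse(bias = 0, variance = se^2)
mse_shrunk <- mse(bias = (shrink - 1) * mu, variance = shrink^2 * se^2)

print(c(unbiased = mse_unbiased, shrunk = mse_shrunk))
```

With these particular numbers the shrunken estimator trades a squared bias of 0.04 for a variance of 0.64 instead of 1, so its MSE is lower. For a large enough mu the squared bias would dominate and the comparison would reverse.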
>> Results from simulations with sd(RandomIntercept)=3 instead of 1
>> (results attached) confirm your remark - with the possible exception
>> of very small data sets the performance (in rmse & bias) for Laplace
>> and AGQ is much much better than PQL.
>> I'm sorry for getting Ben Bolker and others all riled up with my
>> earlier post.
>
> On the contrary, I think this is fascinating and worthwhile.
> It amazes me that we still don't know these very basic things.
>
The nice thing is that most of the time it doesn't make much
difference which approximation is used. Fixed effect estimates, which
are usually what we are interested in, are typically less biased than
random effect variance estimates.
>> One more thing to consider though:
>> A random intercept variance of 1 in a logistic model means that the
>> middle 50% of subjects/groups are expected to have between about
>> half and about double the odds of a subject/group with random
>> intercept=0, which is already a fairly large effect in my book.
>> ##
>>> qlnorm(c(.1, .25, .75, .9))
>> [1] 0.28 0.51 1.96 3.60
>> ##
>>
>> For a random intercept sd of 3, the multiplicative effect on the
>> baseline odds for the middle 50% is between 0.13 and 7.6,
>> ##
>>> qlnorm(c(.1, .25, .75, .9), sdlog = 3)
>> [1] 0.021 0.132 7.565 46.743
>> ##
>> which means really large inter-group/subject heterogeneity and might
>> not be encountered that frequently in real data (?) (or at least
>> suggests a mis-specified model that misses important
>> subject/group-level predictors...).
>>
>> (Similar remarks concerning "effect size" of the random effect apply
>> to Poisson regression with log-link.)
>>
>> So, what's the lesson --
>> Should we still prefer PQL if we expect to see small to intermediate
>> inter-group/subject heterogeneity?
>>
>> Fabian
>
> good question.
>
Provided the bias with either method is small, it isn't a problem,
because there will always be other errors arising from assumptions
about the random effects distribution. There are a reasonable number
of data sets with small cluster sizes and high within-cluster
correlation where we can't explain the correlation, simply because we
don't know its full causes. Many eye diseases are an example.
I like the Laplace/AGQ methodology, where you increase the number of
quadrature points until the fit no longer improves, because it removes
one possible problem.
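As a rough sketch of that check, assuming a current lme4 where glmer() takes the nAGQ argument, and using the cbpp example data that ships with lme4 rather than any data from this thread: refit the same model with more quadrature points and see whether the estimates still move. The helper fit_agq and the particular grid of points are made up for illustration. The sketch compares estimates rather than log-likelihoods, since I am not certain the reported log-likelihoods are on the same scale for nAGQ = 1 and nAGQ > 1.

```r
library(lme4)  # assumed to be installed; the cbpp data set ships with it

# Hypothetical helper: refit one random-intercept logistic model
# with k adaptive Gauss-Hermite quadrature points.
fit_agq <- function(k) {
  glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
        data = cbpp, family = binomial, nAGQ = k)
}

n_points <- c(1, 3, 7, 15)  # nAGQ = 1 is the Laplace approximation
fits <- lapply(n_points, fit_agq)

# One row per setting: fixed effects plus the random-intercept SD.
est <- t(sapply(fits, function(f) {
  c(fixef(f), sd_herd = attr(VarCorr(f)$herd, "stddev")[[1]])
}))
rownames(est) <- paste0("nAGQ=", n_points)
print(round(est, 4))

# Largest absolute change in any estimate between successive settings.
max_change <- apply(abs(diff(est)), 1, max)
print(max_change)
```

Once max_change is negligible relative to the standard errors, adding further points is unlikely to matter for that model and data set.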
Ken
>
> --
> Ben Bolker
> Associate professor, Biology Dep't, Univ. of Florida
> bolker at ufl.edu / www.zoology.ufl.edu/bolker
> GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc