[R-sig-ME] why would using p-values of GLMM for distr other than Gaussian be correct?

Joshua Wiley jwiley.psych at gmail.com
Wed Sep 25 00:03:59 CEST 2013


Thank you, Ben.  That is exactly what I was trying to convey --- one
culture is used to asymptotic results, and the other to the
finite-size correction.
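
To put rough numbers on that difference, here is a minimal sketch using
only base R (pnorm and pt).  The statistic of 2.0 and the list of degrees
of freedom are made-up values for illustration, not taken from any model
in this thread.

```r
# Hypothetical Wald statistic (estimate / standard error); illustrative only.
wald <- 2.0

# Asymptotic ("z") two-sided p-value: standard normal reference distribution.
p_z <- 2 * pnorm(-abs(wald))

# Finite-size ("t") two-sided p-values for several assumed degrees of freedom.
dfs <- c(5, 10, 30, 100, 1000)
p_t <- 2 * pt(-abs(wald), df = dfs)

# Side-by-side comparison: the t-based p-value shrinks toward the
# z-based one as the assumed degrees of freedom grow.
print(data.frame(df = dfs, p_t = round(p_t, 4), p_z = round(p_z, 4)))
```

With 5 degrees of freedom the same statistic gives a p-value of about
0.10 rather than about 0.046, so the choice matters for small designs and
becomes negligible for very large ones.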

On Tue, Sep 24, 2013 at 7:39 AM, Ben Bolker <bbolker at gmail.com> wrote:
> Pablo Inchausti <pablo.inchausti.f at ...> writes:
>
>>
>> Hi Joshua,
>> Thanks for your response.
>> I tend to agree (intuitively) with you that when one has 50,000
>> observations and 1,000 groups for the variable modelled as random effect,
>> assuming a normal distribution for the Wald test = coef/SE(coef) of the fixed
>> effects without bothering about degrees of freedom is reasonable. However,
>> the overwhelming majority of analyses deals with tens or at most a hundred
>> observations and with random effects defined by a factor with a small (but
>> generally greater than 5) number of categories. It is in this (often
>> encountered) context that the discussion of how to count the degrees of
>> freedom for the random effects seems to be critical. This tally of the
>> degrees of freedom lies between two extremes: as one (because only the
>> variance of the normally distributed random effect is estimated) or as the
>> number of categories minus one of the variable modelled as random effect.
>> In many (most?) cases, the assumption regarding the counting of the degrees
>> of freedom does make a difference for evaluating the significance of the
>> fixed effects.
>> The significance tests of the fixed effects require the degrees of
>> freedom of the model, which is why the lme4 package does not provide
>> p-values when family = Gaussian but does provide them whenever family !=
>> Gaussian, which was the question I posed in my mail. Other programs (SAS,
>> Statistica) take a position/assumption about the degrees of freedom of the
>> random effects that is at the very least debatable. D. Bates and others
>> recommend using Bayesian methods to estimate the p-values and the
>> confidence intervals, but the commonly available R functions only work for
>> GLMMs with family = Gaussian and with independent random slopes and
>> intercepts.
>> I hope that this mail helps clarify the questions I posed.
>> Cheers
>> Pablo
>>
>> On 23 September 2013 18:50, Joshua Wiley <jwiley.psych at ...> wrote:
>>
>> > Hi Pablo,
>> >
>> > I think it depends on the assumptions.  In theory with the right
>> > degrees of freedom, you could fit linear mixed effects models on a
>> > smaller sample reasonably.
>> >
>> > There are typically no degrees of freedom for GLMs, and GLMMs follow
>> > suit.  Things like logistic regression rely on large-sample theory:
>> > with a big enough sample, the degrees of freedom are effectively
>> > infinite, the parameter estimates are approximately normally
>> > distributed, and a z test is fine.  The same would hold for linear
>> > mixed models.  If you had, say, 50000 observations from 1000 groups,
>> > p values assuming z = b/se ~ Gaussian would be pretty sensible.
>> >
>> > Cheers,
>> >
>> > Joshua
>> >
>
>  [snip snip snip]
>
>    Just to amplify Joshua's answer: I really think that the reason
> that p values are shown for GLMMs and not LMMs is cultural. The
> classic mixed model ANOVA world is (perhaps appropriately) somewhat
> obsessed with degrees of freedom, which translates to wanting to
> know what the real units of replication are so that proper inference
> can be done; the LMM concern inherits from this.  On the GLM(M) side,
> the *culture* is to rely on asymptotic theory.  There is theory
> about finite-size corrections for GLMs (without random effects),
> under the rubric of "Bartlett corrections", but it's not very
> widely known or used.  Thus, summary.lm (for example) reports
> t statistics (finite-size-corrected) while summary.glm reports Z
> statistics (asymptotic) ...
>
> There's more discussion of this at http://glmm.wikidot.com/faq#df :
> I might add a sentence or two explaining the cultural context.
>
>   Ben

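For anyone who wants to see the summary.lm versus summary.glm contrast
directly, the sketch below fits both to simulated toy data (all names and
values are invented for illustration) and rebuilds the reported p-values by
hand.  One caveat that I am adding here: summary.glm reports z statistics
for families with a fixed dispersion (poisson, binomial) but t statistics
when the dispersion is estimated (gaussian, Gamma, the quasi families), so
the example uses a Poisson fit.

```r
# Simulated toy data; every name and value here is made up for illustration.
set.seed(42)
n <- 15
x <- rnorm(n)
y_gauss <- 1 + 0.5 * x + rnorm(n)
y_count <- rpois(n, lambda = exp(0.3 + 0.5 * x))

# Coefficient tables from the two summary methods.
lm_tab  <- coef(summary(lm(y_gauss ~ x)))
glm_tab <- coef(summary(glm(y_count ~ x, family = poisson)))

colnames(lm_tab)   # "Estimate" "Std. Error" "t value" "Pr(>|t|)"
colnames(glm_tab)  # "Estimate" "Std. Error" "z value" "Pr(>|z|)"

# Rebuild the reported p-values by hand to see which reference
# distribution each summary method used.
p_lm  <- 2 * pt(-abs(lm_tab[, "t value"]), df = n - 2)   # t, residual df
p_glm <- 2 * pnorm(-abs(glm_tab[, "z value"]))           # standard normal

# Both comparisons should agree up to numerical tolerance.
all.equal(unname(p_lm),  unname(lm_tab[, "Pr(>|t|)"]))
all.equal(unname(p_glm), unname(glm_tab[, "Pr(>|z|)"]))
```

The residual degrees of freedom (n - 2 here) are unambiguous for lm, which
is exactly what is missing once random effects enter the model.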


-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com


