[R-sig-ME] Wald t-tests for glmer
Ben Bolker
bbolker at gmail.com
Thu Dec 29 03:07:08 CET 2011
Michael Allen <michaelcobballen at ...> writes:
> Dear List, I have a "Poisson-lognormal" mixed model (structure and
> output below). I would like to perform Wald t-tests to obtain
> P-values for the parameter estimates. I understand that one way to
> do this is to use the z-scores provided by summary() as the
> t-values. My question is what I should use as my df, given the model
> structure below. The y-data are bird counts recorded during 646
> surveys at 44 transect locations. So each transect was surveyed
> multiple times...thus, the inclusion of (1|trans), which is
> essentially the "name" indicating which of the 44 transects were
> surveyed (e.g., L_01, W_04, etc.). Data were overdispersed, so a
> random term representing individual-level variation was included
> (i.e., (1|obs)). Should the df for the t-tests be 646 - 1 = 645 or
> maybe 44 - 1 = 43 ? Is there another way to calculate them? Am I
> even doing the Wald tests correctly? Thanks in advance for your
> help.
It depends. As suggested in http://glmm.wikidot.com/faq , you
can either work out the degrees of freedom yourself based on standard
experimental design principles (i.e. is the variation in your
covariates crossed with your random strata, or is each stratum
homogeneous with respect to the covariates? in other words, do
you have a nested design or a randomized-block design (or a
split-plot design, in which case some covariates are nested
and others are randomized-block)?
I'm guessing from what's below that 'site' does not vary
within transects; I'm not sure about de; and avghtin1 is
continuous, so probably does vary within transects (but that's
just my guess). So I would *guess* that the appropriate df
for the test are:
site, site*avghtin1: estimated at "level 1" (transect level):
ntrans=44 - 4 estimated parameters = 40 df
de, avghtin1, I(avghin1^2), year: estimated at "level 0" (population level);
6 parameters (incl. intercept) estimated at level zero,
df = 646-(44+6) = 596
As suggested by http://glmm.wikidot.com/faq, you could check
these calculations by running the model as a linear mixed model
in LME and see what it reports for df. Also: these df are sufficiently
large that unless you're trying to get very accurate p-values
(e.g. testing at alpha level=0.001) the results will be nearly
the same as a Z test (unlimited df).
Also note (as discussed in the same old FAQ) that it's not
even clear whether the finite-sample embodied in the Wald t-test
is even appropriate for GLMMs ...
> library(lme4)
> library(arm)
> db$obs=1:nrow(db)
>
> b.dens.veg2=glmer(dens~year+site+de+avghtin1+
> I(avghtin1^2)+avghtin1*site+(1|trans)+(1|obs),data=db,family=poisson)
>
> > summary(b.dens.veg2)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: dens ~ year + site + de + avghtin1 + I(avghtin1^2) +
> avghtin1*site + (1 | trans) + (1 | obs)
> Data: db
> AIC BIC logLik deviance
> 697.2 750.9 -336.6 673.2
> Random effects:
> Groups Name Variance Std.Dev.
> obs (Intercept) 0.094321 0.30712
> trans (Intercept) 0.049609 0.22273
> Number of obs: 646, groups: obs, 646; trans, 44
>
> Fixed effects:
> Estimate Std. Error z value
> Pr(>|z|)
> (Intercept) 0.36736 0.16844 2.181 0.0292 *
> year2 0.17644 0.07472
> 2.361 0.0182 *
> year3 0.02453 0.07537
> 0.326 0.7448
> siteP 0.41106 0.22915
> 1.794 0.0728 .
> siteW 0.31676 0.20634
> 1.535 0.1248
> deE -0.53292 0.08145 -6.543
> 6.02e-11 ***
> avghtin1 0.06967 0.18736 0.372
> 0.7100
> I(avghtin1^2) 0.06706 0.05755 1.165 0.2440
> siteP:avghtin1 -0.28355 0.17117 -1.657 0.0976 .
> siteW:avghtin1 0.31147 0.16166 1.927 0.0540 .
More information about the R-sig-mixed-models
mailing list