[R-sig-ME] Comparison of lme4, geepack for binary correlated variables
David Duffy
David.Duffy at qimr.edu.au
Mon Nov 21 08:51:24 CET 2011
On: Chen et al Genet Epidemiol 2011, 35:650-7.
The latest (dead tree) issue of Genetic Epidemiology has a paper using
simulated and real data to compare methods for testing association
between a measured genotype (fixed effect) and a dichotomous outcome in
pedigrees, so there is residual correlation between observations. They use
a) "ordinary" gaussian linear mixed model treating the trait as 0-1 (in
lmekin) b) the binomial-gaussian GLMM using glmer (0.999375-32) c) GEE
in geepack. Simulated data were produced under a threshold model and,
AFAICT [I don't think the paper is well written], a Wald test was used to
assess the fixed effect for all three.
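
For readers unfamiliar with the threshold (liability) model, here is a
minimal sketch of how such binary family data can be generated. It is not
the paper's simulation design: the function name, parameter values, the
single shared family effect (instead of kinship-based correlation) and the
independently drawn genotypes are all simplifying assumptions of mine.

```python
import random
from statistics import NormalDist


def simulate_pedigrees(n_families=200, family_size=4, beta=0.3,
                       family_var=0.5, prevalence=0.1, maf=0.3, seed=1):
    """Simulate (family, genotype, trait) rows under a liability-threshold model.

    Hypothetical illustration: relatives share one gaussian family effect,
    and genotypes are drawn independently (no Mendelian transmission).
    """
    rng = random.Random(seed)
    # Shared + residual variance sum to 1, so liability is N(0, 1) when beta = 0.
    resid_sd = (1.0 - family_var) ** 0.5
    # Threshold giving the requested prevalence when beta = 0.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    rows = []
    for fam in range(n_families):
        shared = rng.gauss(0.0, family_var ** 0.5)  # family-level random effect
        for _ in range(family_size):
            # Additive genotype coding: number of minor alleles (0, 1 or 2).
            g = (rng.random() < maf) + (rng.random() < maf)
            liability = beta * g + shared + rng.gauss(0.0, resid_sd)
            # The observed trait is 1 only when liability exceeds the threshold.
            rows.append((fam, g, int(liability > threshold)))
    return rows
```

Setting beta=0 gives data under the null, which is what one would feed to
each of the three methods to check Type I error; lowering prevalence moves
the threshold further into the tail, the regime where GEE reportedly ran
into numerical trouble.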
You can read the abstract, at least, online: they prefer GEE. The Type I
error of their GLMM test tends to drift up a little as the trait prevalence
increases. They also experienced problems with GLMM when carrying out
small-sample simulations. They did encounter numerical problems with GEE
when the trait prevalence was low, but for this situation they preferred
the gaussian LMM, as they found this to have OK Type I error rates, and
better power than the GLMM (though twice as slow ;)).
The main weakness, of course, is that they did not report LRT results,
although they do mention Hauck-Donner effects as a possible cause of their
problems. Another possible cause is the generating (threshold) model, which
is convenient but different from the logistic-gaussian model being fitted.
And fitting the LMM to binary variables does usually give correct Type I
errors, but when a true effect is present it overestimates the evidence for
association, in my experience.
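
To illustrate why the missing LRT results matter, here is a small sketch of
the Hauck-Donner effect in the simplest setting I could think of: ordinary
logistic regression with one binary covariate, i.e. a 2x2 table, where both
statistics have closed forms. This is a fixed-effects toy, not the GLMM
from the paper, and the function name and counts are made up for the
example.

```python
import math


def wald_and_lrt(events1, n1, events0, n0):
    """Return (Wald, LRT) 1-df chi-square statistics for a 2x2 table.

    Equivalent to logistic regression of the outcome on a binary covariate.
    Assumes every cell of the table is non-zero.
    """
    a, b = events1, n1 - events1
    c, d = events0, n0 - events0
    # Wald: squared log odds ratio over its estimated variance.
    log_or = math.log(a * d / (b * c))
    wald = log_or ** 2 / (1 / a + 1 / b + 1 / c + 1 / d)

    def loglik(events, n, p):
        # Binomial log-likelihood up to a constant that cancels in the LRT.
        return events * math.log(p) + (n - events) * math.log(1 - p)

    p1, p0 = events1 / n1, events0 / n0
    p_null = (events1 + events0) / (n1 + n0)
    # LRT: twice the gain in log-likelihood from separate group proportions.
    lrt = 2 * (loglik(events1, n1, p1) + loglik(events0, n0, p0)
               - loglik(events1, n1, p_null) - loglik(events0, n0, p_null))
    return wald, lrt


if __name__ == "__main__":
    # 25/50 events in the reference group; increasingly extreme exposed group.
    for k in (40, 45, 48, 49):
        w, l = wald_and_lrt(k, 50, 25, 50)
        print(f"{k}/50 vs 25/50: Wald = {w:.2f}, LRT = {l:.2f}")
```

Going from 48/50 to 49/50 events in the exposed group, the Wald statistic
falls (roughly 16.8 to 13.8) even though the effect is stronger, while the
LRT keeps rising (roughly 30.5 to 35.5). That is the behaviour that can
distort Wald tests when fitted probabilities approach 0 or 1.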
Cheers, David Duffy.
--
| David Duffy (MBBS PhD) ,-_|\
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
| Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v