[R-sig-ME] Comparison of lme4, geepack for binary correlated variables

Mon Nov 21 08:51:24 CET 2011

On: Chen et al Genet Epidemiol 2011, 35:650-7.

The latest (dead tree) issue of Genetic Epidemiology has a paper using
simulated and real data to compare methods for testing association
between a measured genotype (fixed effect) and a dichotomous outcome in
pedigrees, so there is residual correlation between observations. They use
a) "ordinary" gaussian linear mixed model treating the trait as 0-1 (in
lmekin) b) the binomial-gaussian GLMM using glmer (0.999375-32) c) GEE
in geepack.  Simulated data were produced under a threshold model and
AFAICT [I don't think the paper is well written], a Wald test was used to
assess the fixed effect for all three.
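
For anyone who wants to play with this kind of design, here is a minimal
sketch of a threshold (liability) generating model.  It is not the
authors' simulation code: the sibship structure, the single shared family
effect standing in for the pedigree correlation, the independently drawn
sib genotypes, and every name and parameter value (simulate_sibships,
maf, beta, family_var, prevalence) are my own assumptions for
illustration.

```python
import random
from statistics import NormalDist


def simulate_sibships(n_families=500, sibs=4, maf=0.3, beta=0.3,
                      family_var=0.4, prevalence=0.1, seed=1):
    """Simulate a binary trait in sibships under a liability-threshold model.

    Hypothetical sketch: returns a list of (family, genotype, trait) rows.
    """
    rng = random.Random(seed)
    # Cut-point on a standard normal liability scale. The realised
    # prevalence only matches `prevalence` exactly when beta == 0.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    resid_sd = (1.0 - family_var) ** 0.5
    rows = []
    for fam in range(n_families):
        # Shared family effect: induces correlation between sibs.
        shared = rng.gauss(0.0, family_var ** 0.5)
        for _ in range(sibs):
            # Assumption: genotypes drawn independently per sib
            # (no Mendelian transmission), coded as 0/1/2 allele count.
            genotype = sum(rng.random() < maf for _ in range(2))
            liability = beta * genotype + shared + rng.gauss(0.0, resid_sd)
            rows.append((fam, genotype, int(liability > threshold)))
    return rows


rows = simulate_sibships()
print(len(rows), sum(trait for _, _, trait in rows) / len(rows))
```

The resulting rows could then be exported and fitted with whichever of
the three approaches you want to compare.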

You can read the abstract, at least, online: they prefer GEE.  Their GLMM
test Type I error tends to drift up a little as the trait prevalence
increases.  They also experienced problems with GLMM when carrying out
small-sample simulations.  They did encounter numerical problems with GEE
when the trait prevalence was low, but for this situation they preferred
the gaussian LMM, as they found this to have OK Type I error rates, and
better power than the GLMM (though twice as slow ;)).

The main weakness, of course, is that they did not report LRT results,
although they do mention Hauck-Donner effects as a possible cause of their
problems.  Another possible one is the generating model, which is
convenient but different from the logistic-gaussian.  And fitting the LMM
to binary variables does usually give correct Type I errors but, in my
experience, overestimates the evidence for association when a true effect
is present.
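
To make the Hauck-Donner point concrete, here is a small sketch in the
simplest possible setting: an ordinary logistic regression of a binary
trait on a binary carrier indicator, i.e. a 2x2 table, with no random
effects.  This is not the GLMM analysis from the paper, and the function
name and the counts are made up for illustration.  For such a table the
Wald test uses the log odds ratio and its usual standard error, and the
LRT is the G statistic.

```python
import math


def wald_and_lrt(a, b, c, d):
    """Wald and likelihood-ratio chi-squares (1 df) for a 2x2 table.

    a, b = cases, controls among carriers;
    c, d = cases, controls among non-carriers (all counts must be > 0).
    """
    log_or = math.log(a * d / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    wald = (log_or / se) ** 2
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected counts under independence: row total * column total / n.
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    lrt = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
    return wald, lrt


# Made-up counts: non-carriers fixed at 50/50, carriers increasingly affected.
for cases in (90, 99):
    wald, lrt = wald_and_lrt(cases, 100 - cases, 50, 50)
    print(cases, round(wald, 2), round(lrt, 2))
```

As the carriers move from 90/10 to 99/1 the LRT statistic goes up, as it
should, while the Wald statistic goes down, because the standard error
grows faster than the estimate.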

Cheers, David Duffy.

-- 
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v