[R-sig-ME] gee, geese and glmer

Thu Mar 13 21:09:22 CET 2014

On 14-03-13 02:01 PM, Yang, Qiong wrote:
> Hi Ben and Martin,
> 
> Thanks you for your reply and taking time to explain the situation. 
> Maybe I wasn't clear in my previous message or don't have a deeper
> understanding of the CRAN policy: Instead of separate packages for
> "old-CRAN" lme4 (lme4.0), and the current lme4, which is not allowed
> by CRAN maintainer, is it possible to have an option in current lme4
> to call the algorithm in old lme4? I thought the new lme4 implements
> a different algorithm and why not make the old algorithm available in
> current lme4 lmer() as an option. It is like we can request pearson
> or spearman correlations in cor(). Sorry if I simplified the problem
> too much.

   It's not so much CRAN policy as the architecture of the software.  A
great deal of the internal structure has changed between versions <1.0
and > 1.0, so putting both versions into one package would basically
mean having copies of all of the old and all of the new code -- and
keeping it all safely distinguished and separate would probably be
substantially more work (we have about as much as we can to do keep up
with a single package ...)  As we said before, we were surprised by the
rejection of lme4.0 by CRAN -- our previous planning had assumed that we
would be able to put lme4.0 on CRAN.  Somewhat to our surprise, there
have been relatively few people contacting us with troubles similar to
yours -- for the most part it seems that is (thankfully) very rare that
lme4 gives significantly worse answers than lme4.0.

  So far it's not clear that lme4's answers are actually worse than
lme4.0's (I admit that all other things being equal, closer to GEE is
more likely to be correct -- on the other hand, we know that GEE gives
marginal estimates of fixed parameters while GLMM gives conditional
estimates, so we shouldn't expect them to be the same).  Can I confirm:

* you've tried this with the most recent development version of lme4
(1.1-4) and you do *not* get any convergence warnings?

* have you compared the deviances based on the old (lme4.0 / lme4 < 1.0)
and new packages?  Here is the outline of how you would do this:

library(lme4)   ## load new package
m1 <- glmer(...) ## lme4 fit
library(lme4.0)  ## load old package
m0 <- lme4.0::glmer(...) ## lme4.0 fit
## extract full parameter set for lme4.0 and lme4 fits
m0parms <- c(getME(m0,"theta"),fixef(m0))
m1parms <- c(lme4::getME(m1,"theta"),fixef(m1))
nparms <- length(m0parms)
ntheta <- length(getME(m0,"theta"))

par(las=1,bty="l")
plot(m0parms,m1parms,col=rep(2:1,c(ntheta,nparms-ntheta)))
abline(a=0,b=1)

## set up deviance function
dd <- lme4::glmer(...,devFunOnly=TRUE)
## compare
dd(m0parms)
dd(m1parms)

  If the new deviance is lower than the old deviance, that's a strong
indication that the new version is actually getting a _better_ fit than
the old one.

  Another diagnostic tool is:

library(bbmle)
ss1 <- slice2D(fun = dd, params = m1parms, verbose = FALSE)
splom(ss1)

  This will show you cross-sections of the deviance surface ...

> To Ben, Sorry that we cannot share data because of confidentiality
> policy. But you are able to debug our problem through PC screen
> sharing where you can gain control and run programs on our Linux 
> session without possessing the data. If that sounds like a good idea,
> let us know when you might be available to do so.

  Can you generate a simulated example that replicates your problem?
simulate() is a useful tool for generating data that are similar, but
not identical to, your original data (you can also anonymize factor
labels etc.).

  PC screen sharing seems difficult -- I have easy access to MacOS and
Linux systems, but not Windows.

> 
> Thanks, Qiong -----Original Message----- From: Ben Bolker
> [mailto:bbolker at gmail.com] Sent: Tuesday, March 11, 2014 9:26 AM To:
> Martin Maechler; Yang, Qiong Cc: r-sig-mixed-models at r-project.org;
> Chen, Ming-Huei Subject: Re: [R-sig-ME] gee, geese and glmer
> 
> I would also point out that we are indeed very interested, in the
> medium term (the short term is very very busy!), in making sure that
> your issues are resolved.  We would like lme4 to dominate lme4.0
> (i.e., to work better in all circumstances). So far it's been a bit
> difficult since we have been debugging remotely -- short of the
> suggestions I gave below, and without access to a reproducible
> example, it's very hard indeed for me to say much more.
> 
> sincerely Ben Bolker
> 
> On 14-03-11 05:45 AM, Martin Maechler wrote:
>>>>>>> Yang, Qiong <qyang at bu.edu> on Mon, 10 Mar 2014 18:19:56
>>>>>>> +0000 writes:
>> 
>>> Hi Ben, We wonder if you can add an option in lmer() of current
>>> lme4 version to call the algorithm used in lme4_0.999999-2?
>> 
>> unfortunately not.
>> 
>> For this reason, we had planned for many months, starting in August
>>  2012 and announced on this mailing list at least a year ago that
>> we would provide the 'lme4.0' package  (back-compatible as well as
>>  possible in light of computer OS updates incl system libraries,
>> and R updates, ...) in order to provide useRs the possibility of 
>> reproducible research and data analysis for their analyses done
>> with "old-CRAN" lme4 versions.
>> 
>> Unfortunately (for us), the CRAN maintainers decided that providing
>>  lme4.0 in addition to lme4 was a bad idea, and explicit forbid
>> such actions in the (then new) CRAN policy document.  I'm still not
>> at all happy with that decision.
>> 
>>> For our package (analyze rare genetic variant) to be put on CRAN,
>>> we need to use current version of lme4. However, at this point,
>>> there are still issues that cannot be resolved with newer
>>> versions of lme4. It is very difficult resolved with newer
>>> versions of lme4. It is very difficult for us to keep waiting and
>>> testing the new release, and hope all the issues resolved and no
>>> new issues coming up. lme4_0.999999-2 has been used by us for a
>>> long time with little problem.
>> 
>> Good to hear.  Such cases were exactly the reason why we (lme4 
>> authors) made such considerable effort to provide  lme4.0 for 
>> reproducible research and data analysis.
>> 
>> at the bottom of 
>> https://github.com/lme4/lme4/blob/master/README.md we mention the
>> state and give installation instruction of lme4.0, but as you say,
>> this does not solve the problem for other package maintainers: If
>> they want a CRAN package, they (currently? I'm optimistic beyond
>> reason :-) cannot have a 'Depends: lme4.0'  (or "Imports:..." or
>> similar).
>> 
>>> Your help on this is highly appreciated.  Best, Qiong
>> 
>> You're welcome; currentl there's not more we can do. Martin
>> 
>> -- Martin Maechler, ETH Zurich, Switzerland
>> 
>> 
>> 
>>> -----Original Message----- From: Ben Bolker 
>>> [mailto:bbolker at gmail.com] Sent: Saturday, March 08, 2014 5:25 PM
>>> To: r-sig-mixed-models at r-project.org Cc: Chen, Ming-Huei; Yang,
>>> Qiong Subject: Re: gee, geese and glmer
>> 
>>> On 14-03-07 11:25 PM, Ming-Huei Chen wrote:
>>>> Hi Ben,
>>>> 
>>>> 
>>>> 
>>>> In an analysis we found that glmer in new lme4 gave result
>>>> different from old lme4, gee and geese, where old lme4 seems to
>>>> be closer to gee and geese.. Please see highlighted sex effect
>>>> below. Case by sex (2x2) table is also given. Can you please
>>>> let us know how would you look into the results? Thanks!
>>>> 
>> 
>>> [cc'ing to r-sig-mixed-models: **please** try r-sig-mixed-models
>>> first, not personal e-mail to me ...]
>> 
>>> I can't say exactly what's going here; without having a 
>>> reproducible example <http://tinyurl.com/reproducible-000> it's
>>> hard to say precisely.  Thoughts:
>> 
>>> * gee and geese are giving _exactly_ the same parameter 
>>> estimates, to 8 significant digits, so I would guess they are
>>> wrapping identical underlying methods.
>> 
>>> * As far as diagnosing the issue with lme4 1.0-6: * does changing
>>> the optimization method, i.e. 
>>> glmerControl(optimizer="optimx",optCtrl=list(method="nlminb")) 
>>> [must do library("optimx") first] or 
>>> glmerControl(optimizer="bobyqa")
>> 
>>> change the result?
>> 
>>> * I would be curious whether the soon-to-be-released version
>>> 1.1-4 (which can be installed from github or 
>>> lme4.r-forge.r-project.org/repos) gives either (1) convergence
>>> warnings or (2) different/better answers
>> 
>>> * You can try specifying the starting values for lme4 to diagnose
>>> misconvergence; for example, start lme4 from the estimates given
>>> by old lme4/lme4.0 and see if it gives a similar answer.
>> 
>>> * You can use the 'slice' and 'splom.slice' functions from bbmle
>>> to visualize the likelihood surfaces
>> 
>>> good luck, Ben Bolker
>> 
>>>> Ming-Huei
>>>> 
>> 
>>>> ###GEE
>>>> 
>>>>> summary(gee(case~sex+PC1+PC2+PC3+PC4,id=famid,family=binomial,data=da
>>
>>>>> 
>>> ta))$coef
>>>> Estimate Naive S.E.  Naive z Robust S.E.  Robust z (Intercept)
>>>> -1.88047373 0.13532162 -13.8963286 0.15960440 -11.782092 sex
>>>> -0.23436854 0.08611269 -2.7216494 0.09050577 -2.589543 PC1
>>>> -0.05478639 0.06195318 -0.8843192 0.06822178 -0.803063 PC2
>>>> -0.09934572 0.06494563 -1.5296753 0.06520811 -1.523518 PC3 
>>>> -0.07020391 0.06626875 -1.0593818 0.06962147 -1.008366 PC4
>>>> -0.13413097 0.06746716 -1.9880927 0.06979901 -1.921674
>>>> 
>> 
>>>> ###GEESE
>>>> 
>>>>> summary(geese(case~sex+PC1+PC2+PC3+PC4,id=famid,family=binomial,data=
>>
>>>>> 
>>> data))$mean
>>>> 
>>>> estimate san.se wald p
>>>> 
>>>> (Intercept) -1.88047373 0.15960440 138.8176912 0.000000000 sex
>>>> -0.23436854 0.09050577 6.7057312 0.009610351 PC1 -0.05478639
>>>> 0.06822178 0.6449102 0.421938319 PC2 -0.09934572 0.06520811
>>>> 2.3211071 0.127629159 PC3 -0.07020391 0.06962147 1.0168016 
>>>> 0.313278888 PC4 -0.13413097 0.06979901 3.6928324 0.054646745
>>>> 
>>>> ### lme4_0.999999-2
>>>> 
>>>>> summary(glmer(case~sex+PC1+PC2+PC3+PC4+(1|famid),family=binomial,data
>>
>>>>> 
>>> =data))
>>>> Estimate Std. Error z value Pr(>|z|) (Intercept) -3.01599 
>>>> 0.28305 -10.655 <2e-16 *** sex -0.41056 0.16285 -2.521 0.0117 *
>>>> PC1 -0.17116 0.12903 -1.326 0.1847 PC2 -0.15510 0.13382 -1.159
>>>> 0.2465 PC3 -0.19044 0.13580 -1.402 0.1608 PC4 0.02532 0.13732
>>>> 0.184 0.8537
>>>> 
>>>> ###lme4_1.0-6
>>>> 
>>>>> summary(glmer(case~sex+PC1+PC2+PC3+PC4+(1|famid),family=binomial,data
>>
>>>>> 
>>> =data))
>>>> 
>>>> Estimate Std. Error z value Pr(>|z|)
>>>> 
>>>> (Intercept) -10.2784 0.8631 -11.909 <2e-16 *** sex 0.3497 
>>>> 0.1975 1.770 0.0767 .  PC1 -0.3555 0.1623 -2.190 0.0285 * PC2
>>>> -0.1087 0.1653 -0.657 0.5109 PC3 -0.2242 0.1652 -1.357 0.1748
>>>> PC4 0.1103 0.1671 0.660 0.5091
>>>> 
>>>> Case by sex
>>>> 
>>>> 1 2 0 2554 3021 1 310 290
>>>> 
>> 
>>> _______________________________________________ 
>>> R-sig-mixed-models at r-project.org mailing list 
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> 
>