[R-sig-ME] lme4 convergence/numerical issue with large sample 2 level logit
Daniel Adkins
deadkins at vcu.edu
Sun May 29 03:39:57 CEST 2011
Dennis,
Very kind of you to help me out - I really do appreciate it and hope
to return the favor some day.
I checked out the book chapters/course notes and did not really find
anything directly germane to my current problem. I also tried lme4a
'glmer' w/ 'nAGQ' but it only allows values of 0 and 1, so no help
there.
My sense is that my best bet is going to be using start values from a
random intercept logit (which will estimate nicely using 'lmer') in
the random intercept random slope logit model that I am trying to fit.
I know this is a very basic question, and not really worthy of your
time, but could you please share syntax for inputting starts to lmer?
For instance, in the simple model: "proto <- lmer(hibpe ~ age + (age
| id), nAGQ =250, family=binomial, data=data)"
How would I code start values for: fixed intercept & age coefficients,
and random intercept variance? Also, do you share my hunch that this
might facilitate/speed convergence?
Best,
D
--
Daniel E. Adkins, PhD
Assistant Professor
Center for Biomarker Research and Personalized Medicine
School of Pharmacy
Virginia Commonwealth University
McGuire Hall, Room 216B
1112 East Clay Street
Richmond, VA 23298
-----Dennis Murphy <djmuser at gmail.com> wrote: -----
To: Daniel Adkins <deadkins at vcu.edu>
From: Dennis Murphy <djmuser at gmail.com>
Date: 05/28/2011 03:40AM
Subject: Re: [R-sig-ME] lme4 convergence/numerical issue with large
sample 2 level logit
Hi:
On Fri, May 27, 2011 at 2:36 PM, Daniel Adkins <deadkins at vcu.edu> wrote:
> Dennis,
> I figured it out. Thanks so much!
>
> I am using the syntax that I would normally use with lme4. this is
> correct, no? also, lme4a doesn't seem to allow nAGQ - is that your
> experience. is there other syntax for specifying quadrature points or
> is it laplace approx only?
For the most part, but there are functions and algorithms in lme4a
that don't exist in lme4. A good reference would be the draft book on
lme4 that Doug started writing up last year. It can be found at
http://lme4.r-forge.r-project.org/
Look for the draft chapters link. If that isn't enough material, go to
the slides link and grab notes from the most recent offerings of his
mixed models course. Doug has given the course twice in 2011, so the
most recent documentation would be found there. The course notes are
more complete in topic coverage than is the book draft at this point
in time.
Re your question, the help page of lmer() in lme4a indicates that
there is an nAGQ argument in glmer() and nlmer(), but not in lmer()
itself:
nAGQ an integer - the number of points per axis for evaluating the
adaptive Gauss-Hermite approximation to the log-likelihood. This
defaults to 1, corresponding to the Laplacian approximation. Values
greater than 1 produce greater accuracy in the evaluation of the
log-likelihood at the expense of speed. A value of zero use a quicker
but less exact form of parameter estimation for nonlinear models by
optimizing the random effects and the fixed-effects coefficients in
the penalized least squares step.
Dennis
>
> D
>
> On Fri, May 27, 2011 at 4:53 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>> What's your sessionInfo() ?
>>
>> On Fri, May 27, 2011 at 1:44 PM, Daniel Adkins <deadkins at vcu.edu> wrote:
>>> Dennis,
>>> Thank you for your help. I save installed the "Matrix" package several
>>> different ways (until i am currently blue in the face) and still
>>> receive the following error:
>>>
>>>> library(lme4a)
>>> Error: package 'Matrix' required by 'lme4a' could not be found
>>>
>>> Thoughts?
>>>
>>> Best,
>>> D
>>>
>>> On Fri, May 27, 2011 at 4:16 PM, Dennis Murphy <djmuser at gmail.com> wrote:
>>>> Hi:
>>>>
>>>> Re installation of lme4a on Win7, there are at least two ways to do it:
>>>>
>>>> (1) Within an R session, copy and paste the following:
>>>> install.packages("lme4a", repos="http://R-Forge.R-project.org")
>>>> install.packages("Matrix", repos="http://R-Forge.R-project.org")
>>>>
>>>> (2) Go to http://r-forge.r-project.org/bin/windows/contrib/latest/
>>>> and download the latest binaries for lme4a and Matrix as zip files
>>>>
>>>> In R, go to 'Install packages from local zip files' from the
>>>> Packages menu in Rgui, find the two zip files and then install them.
>>>> You need the development version of Matrix on R-forge for
>>>> compatibility with lme4a (true at least up to May 15 or so, but double
>>>> check). The last time I installed it, I got an error on startup that
>>>> was eliminated when I upgraded Matrix to the development version on
>>>> R-forge; that's why I suggest installing both at the same time. Here
>>>> are the versions of each that I have:
>>>>
>>>> < from sessionInfo() >
>>>> other attached packages:
>>>> [1] lme4a_0.999375-66 MatrixModels_0.2-1 minqa_1.1.15 Rcpp_0.9.4
>>>> [5] Matrix_0.9996875-0 rJava_0.8-8 sos_1.3-0 brew_1.0-6
>>>> [9] lattice_0.19-26 ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4
>>>> [13] plyr_1.5.2
>>>>
>>>> AFAICT, both are working fine in concert and both methods mentioned
>>>> above are equivalent; the difference is that method (2) requires more
>>>> work on your part :)
>>>>
>>>> I'm cc-ing this to Doug just in case he's not aware of the 'fix' on
>>>> Win 7 systems for lme4a (he most likely does, but better to be safe).
>>>>
>>>> Happy mixed modeling :)
>>>> Dennis
>>>>
>>>> On Fri, May 27, 2011 at 12:35 PM, Daniel Adkins <deadkins at vcu.edu> wrote:
>>>>> doug,
>>>>> thanks for your reply. i do tons of mixed modeling, so obviously am a
>>>>> big fan of / highly dependent on your work.
>>>>>
>>>>> results from:
>>>>>>sessionInfo()
>>>>>
>>>>> R version 2.13.0 (2011-04-13)
>>>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>>>
>>>>> locale:
>>>>> [1] LC_COLLATE=English_United States.1252
>>>>> [2] LC_CTYPE=English_United States.1252
>>>>> [3] LC_MONETARY=English_United States.1252
>>>>> [4] LC_NUMERIC=C
>>>>> [5] LC_TIME=English_United States.1252
>>>>>
>>>>> attached base packages:
>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>
>>>>> other attached packages:
>>>>> [1] MCMCglmm_2.12 corpcor_1.5.7 ape_2.7-1 coda_0.14-4
>>>>> [5] tensorA_0.36 lme4_0.999375-39 Matrix_0.999375-50 lattice_0.19-23
>>>>> [9] nlme_3.1-100 foreign_0.8-43
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] gee_4.13-16 grid_2.13.0 stats4_2.13.0 tools_2.13.0
>>>>>
>>>>> i apologize for trying MCMCglmm, feel like i have lipstick on my collar (ha!).
>>>>>
>>>>> Please do tell me how to install "lme4a", I have downloaded the zip
>>>>> file for windows
>>>>> (https://r-forge.r-project.org/bin/windows/contrib/latest/lme4a_0.999375-66.zip),
>>>>> but haven't figured out how to load it into working session of R.
>>>>>
>>>>> also, could you point me in the right direction for start values?
>>>>> would it appreciably help optimization to use results from random
>>>>> intercept only model as starts for fixed effects and random intercept
>>>>> variance estimate?
>>>>>
>>>>> also, what do you think of "glmmPQL"? does that yield decent estimates
>>>>> for models of my level of size/complexity?
>>>>>
>>>>> Thx for any help.
>>>>> D
>>>>>
>>>>>
>>>>> On Fri, May 27, 2011 at 2:14 PM, <dmbates at gmail.com> wrote:
>>>>>> On Thu, May 26, 2011 at 8:53 PM, Daniel Adkins <deadkins at vcu.edu> wrote:
>>>>>>> hi all,
>>>>>>> trying to fit either this model:
>>>>>>>
>>>>>>> proto1 <- lmer(hibpe ~ age + b + b_age + h + h_age + female +
>>>>>>> female_age + bxf + hxf + numwaves + dead + nodoctor + nohosp
>>>>>>> +(1|hhidpn) +(0 + age | hhidpn) , nAGQ =300, family=binomial,
>>>>>>> data=hrs_data, na.action =na.omit, verbose=TRUE)
>>>>>>>
>>>>>>> or the same model allowing correlation btwn the REs:
>>>>>>>
>>>>>>> proto2 <- lmer(hibpe ~ age + b + b_age + h + h_age + female +
>>>>>>> female_age + bxf + hxf + numwaves + dead + nodoctor + nohosp
>>>>>>> + (age | hhidpn) , nAGQ =300, family=binomial, data=hrs_data,
>>>>>>> na.action =na.omit, verbose=TRUE)
>>>>>>>
>>>>>>> it is big data: ~50K obs & ~8400 clusters (subjects, in this case).
>>>>>>>
>>>>>>> when i try to fit these models they run ~forever (>72 hours) with no
>>>>>>> convergence. when i try a laplace approx for the quadrature (i.e.,
>>>>>>> nAGQ=1) I get false convergence.
>>>>>>>
>>>>>>> I can successfully fit a slightly less complex model in which
>>>>>>> everything is the same except dropping the age RE:
>>>>>>>
>>>>>>> proto2 <- lmer(hibpe ~ age + b + b_age + h + h_age + female +
>>>>>>> female_age + bxf + hxf + numwaves + dead + nodoctor + nohosp
>>>>>>> + (1 | hhidpn) , nAGQ =300, family=binomial, data=hrs_data,
>>>>>>> na.action =na.omit, verbose=TRUE)
>>>>>>>
>>>>>>> this takes ~25 minutes to converge, and it gives an accurate solution
>>>>>>> (cross-validated in stata via xtlogit and (for fixed effects) using
>>>>>>> simple, single level logit). notably, i only get global optimum
>>>>>>> convergence once I set nAGQ>200.
>>>>>>
>>>>>> nAGQ > 200 ? That seems rather drastic when you have an average of 6
>>>>>> observations per subject.
>>>>>>
>>>>>> The problem may be related to the optimizer. The development version
>>>>>> of the package, called lme4a, uses another optimizer that has, in most
>>>>>> but not all cases, been faster and more reliable. You may want to try
>>>>>> that version of the package instead. If you can tell us what platform
>>>>>> you are using, say by providing the output of
>>>>>>
>>>>>> sessionInfo()
>>>>>>
>>>>>> we should be able to provide you with installation instructions for
>>>>>> lme4a (although I think that installation on Mac OS X is still not
>>>>>> straightforward).
>>>>>>
>>>>>>
>>>>>>> Obviously, the estimation of the 2nd RE is the issue and a very
>>>>>>> difficult numerical problem given data size/structure. Stata
>>>>>>> (xtmelogit) just conks out almost immediately and says that it doesn't
>>>>>>> have the memory to handle the quadrature optimization (not in so many
>>>>>>> words, but still), even with memory maxed out for the
>>>>>>> program/hardware.
>>>>>>>
>>>>>>> Any suggestions for shortcuts to make 'proto1' or 'proto2' estimate? i
>>>>>>> am about to have to resort to BUGS, which will take eons, but at least
>>>>>>> will eventually give decent estimates. Would strongly prefer sticking
>>>>>>> with R though. Start values maybe? Would that help? Any code for
>>>>>>> inputting these (i have never done it before in R)? nlme? hglm? I am
>>>>>>> desperate/open to any suggestions.
>>>>>>>
>>>>>>> Thx,
>>>>>>> D
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Daniel E. Adkins, PhD
>>>>>>> Assistant Professor
>>>>>>> Center for Biomarker Research and Personalized Medicine
>>>>>>> School of Pharmacy
>>>>>>> Virginia Commonwealth University
>>>>>>> McGuire Hall, Room 216B
>>>>>>> 1112 East Clay Street
>>>>>>> Richmond, VA 23298
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Daniel E. Adkins, PhD
>>>>> Assistant Professor
>>>>> Center for Biomarker Research and Personalized Medicine
>>>>> School of Pharmacy
>>>>> Virginia Commonwealth University
>>>>> McGuire Hall, Room 216B
>>>>> 1112 East Clay Street
>>>>> Richmond, VA 23298
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Daniel E. Adkins, PhD
>>> Assistant Professor
>>> Center for Biomarker Research and Personalized Medicine
>>> School of Pharmacy
>>> Virginia Commonwealth University
>>> McGuire Hall, Room 216B
>>> 1112 East Clay Street
>>> Richmond, VA 23298
>>>
>>
>
>
>
> --
> Daniel E. Adkins, PhD
> Assistant Professor
> Center for Biomarker Research and Personalized Medicine
> School of Pharmacy
> Virginia Commonwealth University
> McGuire Hall, Room 216B
> 1112 East Clay Street
> Richmond, VA 23298
>
--
Daniel E. Adkins, PhD
Assistant Professor
Center for Biomarker Research and Personalized Medicine
School of Pharmacy
Virginia Commonwealth University
McGuire Hall, Room 216B
1112 East Clay Street
Richmond, VA 23298
More information about the R-sig-mixed-models
mailing list