[R-sig-ME] problems with allocate memory

cumuluss at web.de cumuluss at web.de
Thu Dec 22 01:22:14 CET 2011


Hi Douglas,
OK, that is not the good news I was hoping for :'(. If I understand correctly, Andrew's implementation would first need to be checked by you before it can be used with lme4. One more question, please: what would be a rough time estimate for R-core to provide 64-bit indices for atomic R vectors?
Thank you very much for your help and suggestions.
All the Best
Paul


-------- Original Message --------
> Date: Wed, 21 Dec 2011 17:10:49 -0600
> From: Douglas Bates <bates at stat.wisc.edu>
> To: cumuluss at web.de
> CC: r-sig-mixed-models at r-project.org, a.r.runnalls at kent.ac.uk
> Subject: Re: [R-sig-ME] problems with allocate memory

> On Wed, Dec 21, 2011 at 5:03 PM,  <cumuluss at web.de> wrote:
> > Hi Douglas,
> >
> > thank you for your reply, but it does not sound good for me. Could you
> > please suggest something else I could do, or something I could do
> > differently? You said there are no simple solutions at present. Is there
> > a complicated one available that I could try?
> > Regarding the third part of your answer, where you mentioned Andrew
> > Runnalls and "reimplementing R": could this also help with my problem of
> > models that will not fit, or only with the issue of opening the model
> > results?
>
> By "no easy solutions" I mean that I can't think of any approach that
> doesn't involve reimplementing the code from scratch, which would
> definitely take a long time.  Consider how long we have been working
> on getting a 1.0 version of lme4 :-)
>
> I am not sure if Andrew's CXXR implementation of R would be more
> effective or not.  I think it has a better garbage collection scheme
> but I haven't tried it and I don't know if lme4 would build in that
> system.  I have an item on the "ToDo" list to try it but I have a lot
> of items on the "ToDo" list.
>
> Basically you will need to use a simpler model or fit to a sample of
> your data or wait for R-core to determine if it is possible to use
> 64-bit indices for atomic R vectors.
>
> > -------- Original Message --------
> >> Date: Wed, 21 Dec 2011 15:47:36 -0600
> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> To: cumuluss at web.de
> >> CC: r-sig-mixed-models at r-project.org, "a.r.runnalls" <a.r.runnalls at kent.ac.uk>
> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >
> >> On Tue, Dec 20, 2011 at 5:25 PM,  <cumuluss at web.de> wrote:
> >> > Hi Douglas,
> >> >
> >> > The variable 'd' has about 710 levels.
> >> >
> >> > As for your other request, I tried to fit the suggested model, but it
> >> > was not possible. I tried different approaches, first without any
> >> > interactions and without the nonlinear term. That fitted. The object
> >> > size was about 731571664 bytes. Then I successively made the model more
> >> > complex. With one two-way interaction, with one three-way interaction,
> >> > or with the nonlinear term, it was roughly the same as before. With
> >> > five two-way interactions, always with the nonlinear term, the object
> >> > size went up to 1075643424 bytes. With one additional two-way
> >> > interaction the model won't fit anymore and gives the known error.
> >>
> >> Which is an indication that the fixed-effects model matrix is getting
> >> to be too large.  There are no simple solutions at present.  You may
> >> find that some packages allow you to fit such large models by working
> >> with horizontal chunks of the data and accumulating the result but
> >> extending those to GLMMs would be decidedly non-trivial.
> >>
> >> > Perhaps another hint: yesterday I attempted to fit a much simpler
> >> > model with lmer, just to see if it works:
> >> >
> >> > mfit=lmer(gr.b ~ f.ag + f.se + o.se + diff + exp.r + kl +
> >> >     (1|f)+(1|o), data=c.data, family=binomial)
> >> >
> >> > It fitted, but I could not open mfit. Trying to see only the
> >> > coefficients did not work either. I saved the image, and it is
> >> > unusually large, about 1.7 GB.
> >>
> >> The problem there is that the implicit print(mfit) (which is what I
> >> imagine you mean when you say "could not open") ends up taking copies
> >> of the whole object, which will eat up all your memory.  Development
> >> versions of lme4 may eventually help with that.
> >>
> >> > By the way: after reloading the image, something interesting
> >> > happened. An error occurred: slot coefs are not an S4 object. It seems
> >> > to me that it is not possible to save the model results in an R image.
> >> > Is that right?
> >>
> >> It should be possible to save and load such an object but there is
> >> always the problem that when you have an object that is a sizable
> >> fraction of the total available memory then you can get bitten if
> >> something behind the scenes happens to take a copy at some point in
> >> the calculation.  The original design in R for keeping track of when a
> >> copy must be made is not the greatest and, as a result, R is somewhat
> >> conservative when deciding whether or not to copy an object.  Getting
> >> around that limitation would mean reimplementing R, more-or-less from
> >> scratch and Andrew Runnalls is the only person I know who is willing
> >> to embark on that.
> >>
> >> > -------- Original Message --------
> >> >> Date: Tue, 20 Dec 2011 09:23:37 -0600
> >> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> >> To: cumuluss at web.de
> >> >> CC: r-sig-mixed-models at r-project.org
> >> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >> >
> >> >> On Mon, Dec 19, 2011 at 5:54 PM,  <cumuluss at web.de> wrote:
> >> >> > Hi Douglas,
> >> >> >
> >> >> > thank you for your reply. This is "mydata"
> >> >> >
> >> >> > 'data.frame':   3909896 obs. of  19 variables:
> >> >> >  $ gr.b  : int  0 0 0 0 0 0 0 0 0 0 ...
> >> >> >  $ o.ag  : num  -0.651 -0.651 -0.651 -0.651 -0.651 ...
> >> >> >  $ o.rar : num  -0.935 -0.935 -0.935 -0.935 -0.935 ...
> >> >> >  $ si    : num  0.299 0.299 0.299 0.299 0.299 ...
> >> >> >  $ f.ag  : num  -1.25 -1.36 -1.33 -1.26 -1.21 ...
> >> >> >  $ f.se  : Factor w/ 2 levels "F","M": 1 2 1 2 2 2 1 1 1 1 ...
> >> >> >  $ o.se  : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
> >> >> >  $ diff  : num  -0.536 -0.514 -0.521 -0.534 -0.545 ...
> >> >> >  $ exp.r : num  -0.168 -0.168 -0.163 -0.168 -0.168 ...
> >> >> >  $ f.rar : num  -0.911 0.215 1.224 -1.086 1.107 ...
> >> >> >  $ f.si  : num  1.0008 1.1583 0.0561 -0.4163 0.371 ...
> >> >> >  $ kl    : Factor w/ 3 levels "mat","nonkin",..: 1 2 2 2 3 2 1 2 2 2 ...
> >> >> >  $ sn    : Factor w/ 2 levels "BS","MS": 1 1 1 1 1 1 1 1 1 1 ...
> >> >> >  $ MP_y_n: Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
> >> >> >  $ ratio : num  -0.0506 -0.0506 -0.0506 -0.0506 -0.0506 ...
> >> >> >  $ f     : Factor w/ 55 levels "0A0","0A1","0A2",..: 1 6 7 8 9 10 11 13 15 16 ...
> >> >> >  $ o     : Factor w/ 552 levels "","00T","00Z",..: 2 2 2 2 2 2 2 2 2 2 ...
> >> >> >  $ d     : int  9099 9099 9099 9099 9099 9099 9099 9099 9099 9099 ...
> >> >> >  $ MP    : num  6 6 6 6 5 4 6 6 6 6 ...
> >> >> >
> >> >> >
> >> >> > formula for the model:
> >> >> > mfit=lmer(gr.b ~ o.ag + o.rar + si +
> >> >> >     ((f.ag + I(f.ag^2)) * (f.se * (o.se + diff + exp.r + f.rar +
> >> >> >     f.si + kl + sn + MP_y_n + ratio))) +
> >> >> >     (1|f)+(1|o)+(1|d) + offset(log(MP)), data=c.data, family=binomial)
> >> >>
> >> >> > I hope this is what you want to see. Thank you for your help.
> >> >>
> >> >> My guess is that the problem is with creating the fixed-effects
> >> >> model matrix, of which there could be several copies created during
> >> >> the evaluation and optimization of the deviance.
> >> >>
> >> >> Just as a test, could you fit the model for the fixed-effects only
> >> >> using glm and check on what the size of the model matrix is?
> >> >> Something like
> >> >>
> >> >> glm1 <- glm(gr.b ~ o.ag + o.rar + si + ((f.ag + I(f.ag^2)) * (f.se *
> >> >> (o.se + diff + exp.r + f.rar + f.si + kl + sn + MP_y_n + ratio))) +
> >> >> offset(log(MP)), data=c.data, family=binomial)
> >> >> object.size(model.matrix(glm1))
> >> >>
> >> >> Also, could you convert 'd' to a factor and run str again so we can
> >> >> learn how many levels there are?  Either that or send the result of
> >> >>
> >> >> length(unique(mydata$d))
> >> >>
> >> >>
> >> >> > -------- Original Message --------
> >> >> >> Date: Mon, 19 Dec 2011 14:50:06 -0600
> >> >> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> >> >> To: cumuluss at web.de
> >> >> >> CC: r-sig-mixed-models at r-project.org
> >> >> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >> >> >
> >> >> >> On Sun, Dec 18, 2011 at 3:17 PM,  <cumuluss at web.de> wrote:
> >> >> >> > Hi to everyone,
> >> >> >>
> >> >> >> > I have been trying to fit a GLMM with a binomial error structure.
> >> >> >> > My model is somewhat complex: I have 8 continuous predictor
> >> >> >> > variables, one of them as a nonlinear term, and 5 categorical
> >> >> >> > predictor variables, with some three-way interactions between them.
> >> >> >> > Additionally, I have 3 random effects and one offset variable in the
> >> >> >> > model. The number of observations is greater than 3 million.
> >> >> >> > I'm working with the latest version of R, 2.14.0, on a 64-bit
> >> >> >> > Windows system with 8 GB of RAM.
> >> >> >> > Whatever I tried (reducing model complexity, a different 64-bit PC
> >> >> >> > with even more memory), nothing leads to a fitted model; the error
> >> >> >> > always occurs: cannot allocate vector of size 2GB.
> >> >> >> > Is there anything I can do? I would be very grateful for any
> >> >> >> > comments.
> >> >> >>
> >> >> >> You probably have multiple copies of some large objects hanging
> >> >> >> around.  Can you send us the output of
> >> >> >>
> >> >> >> str(myData)
> >> >> >>
> >> >> >> where 'myData' is the name of the model frame containing the data
> >> >> >> you are using and the formula for the model you are trying to fit?
> >> >> >
> >> >> > _______________________________________________
> >> >> > R-sig-mixed-models at r-project.org mailing list
> >> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >> > ___________________________________________________________
> >> > SMS schreiben mit WEB.DE FreeMail - einfach, schnell und
> >> > kostenguenstig. Jetzt gleich testen! http://f.web.de/?mc=021192

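The thread points to the dense fixed-effects model matrix as the likely source of the allocation failure. A minimal sketch of one way to gauge its size before attempting the full fit is to build the matrix on a small subsample and scale the column count up to the full number of rows. The helper name est_mm_bytes, the simulated data frame toy, and its variables x, z, g1 and g2 are assumptions made for illustration; they are not the poster's data or anything provided by lme4. The estimate counts only the 8-byte double entries and ignores dimnames and any copies made during fitting.

```r
# Hypothetical helper (not part of R or lme4): estimate the bytes needed for
# the double-precision entries of a dense model matrix with `n_full` rows,
# using the number of columns obtained from a small subsample.
est_mm_bytes <- function(formula, data, n_full, n_sub = 1000) {
  idx <- sample(nrow(data), min(n_sub, nrow(data)))
  # Subsetting rows keeps all factor levels, so the column count is unchanged
  mm <- model.matrix(formula, data = data[idx, , drop = FALSE])
  p <- ncol(mm)
  list(columns = p, bytes = 8 * as.numeric(n_full) * p)
}

# Simulated stand-in data (assumption): two numeric predictors, two factors
set.seed(1)
toy <- data.frame(
  x  = rnorm(5000),
  z  = rnorm(5000),
  g1 = factor(sample(c("a", "b"), 5000, replace = TRUE)),
  g2 = factor(sample(c("u", "v", "w"), 5000, replace = TRUE))
)

# A formula with the same nesting pattern as the one in the thread
fm <- ~ (x + I(x^2)) * (g1 * (z + g2))

# 3909896 is the number of observations reported by str() in the thread
est <- est_mm_bytes(fm, data = toy, n_full = 3909896)
est$columns
est$bytes / 2^30  # approximate size in GiB, entries only
```

Because several copies of this matrix may exist at once during fitting, the figure is best read as a lower bound per copy, not as the total memory required.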


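One of the workarounds suggested in the thread is to fit to a sample of the data. The following is only a sketch of that idea, using base R's glm on simulated data for the fixed effects alone, in the spirit of the glm check requested earlier in the thread. The data frame sim, its columns, the sample size of 2000 and the coefficients used in the simulation are all assumptions chosen for illustration. A mixed model fitted to a row sample would additionally need care that the grouping factors are still adequately represented.

```r
# Simulated stand-in data (assumption): binary response, one numeric
# predictor and one two-level factor
set.seed(42)
n <- 20000
sim <- data.frame(
  x = rnorm(n),
  g = factor(sample(c("a", "b"), n, replace = TRUE))
)
sim$y <- rbinom(n, 1, plogis(-1 + 0.5 * sim$x))

# Draw a simple random sample of rows and fit the fixed-effects model to it
sub <- sim[sample(nrow(sim), 2000), ]
fit_sub <- glm(y ~ x * g, data = sub, family = binomial)

coef(fit_sub)
# Size of the fitted object, which grows with the number of rows used
format(object.size(fit_sub), units = "auto")
```

Repeating the fit on a few independent samples gives a rough impression of how stable the estimates are at that sample size.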
