[R-sig-ME] problems with allocate memory

cumuluss at web.de cumuluss at web.de
Thu Dec 22 20:05:28 CET 2011


Hi Douglas,
maybe you saw the mail from Andrew. He has already tested it, but not with the newest CXXR version. I'm not really skilled enough, I think, but perhaps I will try to work with CXXR and lme4 to check this out. I will probably get stranded, but on the other hand I'm already stranded.
I considered my model more carefully, but as I wrote before, I already reduced it a lot: no interactions, no nonlinear term, and only six fixed effects. Even so, I was not able to open the results. I don't see how an even more reduced model would tell me anything.
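For what it's worth, this is the kind of size check I mean, sketched in R on made-up data along the lines of Douglas' earlier glm/model.matrix suggestion. The data frame 'dat' and its columns are just a small stand-in, not my real c.data:

```r
## Check how large the fixed-effects model matrix becomes before
## attempting a fit; every interaction multiplies the column count.
set.seed(1)
n   <- 1000
dat <- data.frame(y  = rbinom(n, 1, 0.5),
                  x1 = rnorm(n),
                  x2 = rnorm(n),
                  g  = factor(sample(c("a", "b", "c"), n, replace = TRUE)))

## model.matrix() expands the formula into the fixed-effects design
## matrix; its size is a lower bound on the memory the fit needs.
X.simple <- model.matrix(y ~ x1 + x2 + g, data = dat)
X.full   <- model.matrix(y ~ x1 * x2 * g, data = dat)

ncol(X.simple)   # 5 columns: intercept, x1, x2, two dummies for g
ncol(X.full)     # 12 columns once all interactions are expanded
print(object.size(X.full), units = "Kb")
```

On the real data the same two lines, with the real formula, would show how fast the matrix grows with each added interaction.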
It is clear that I have to think about it again, and I hope it leads me to a solution.
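Here is also a sketch of fitting to a random subsample first, as Douglas suggested, again on made-up stand-in data; with the real data I would take c.data[idx, ] and call lmer() the same way:

```r
## Fit the model on a random subsample of rows to see whether the
## formula is feasible at all before trying all 3.9 million rows.
set.seed(2)
n   <- 20000
dat <- data.frame(gr.b = rbinom(n, 1, 0.3),
                  f.ag = rnorm(n),
                  diff = rnorm(n))

idx <- sample(nrow(dat), 5000)   # 5000-row subsample
fit <- glm(gr.b ~ f.ag + diff, data = dat[idx, ], family = binomial)

## coef() pulls out just the estimates; the implicit print() of a huge
## fitted object is what copies everything and eats the memory.
coef(fit)
```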
Thank you very much for your effort.
All the Best
Paul


> On Wed, Dec 21, 2011 at 6:22 PM,  <cumuluss at web.de> wrote:
> > Hi Douglas,
> > Ok, those are not the expected beautiful news :'(. If I understand it
> right Andrews implementation needs your check first before it is usably with
> lme4.
>
> I believe Andrew checks his CXXR builds against several packages so
> perhaps he has already tried to install lme4.  I am just suggesting
> the possibility that CXXR may be more effective in memory usage - I
> don't know and really don't want to stop other development to check it
> out.
>
> Basically you are going to need to consider your model more carefully.
>  The fact that you can specify a model with dozens of fixed-effects
> parameters doesn't mean it will be meaningful.
>
> > And please, one more question: what would be a rough time estimate for
> a new R core with 64-bit indices for atomic R vectors?
>
> That hasn't been established yet.  The current plan is to make some
> changes and see what breaks then decide if it is even feasible to try
> to patch everything back together.
>
> > Thank you very much for your help and suggestions.
> > All the Best
> > Paul
> >
> >
> > -------- Original Message --------
> >> Date: Wed, 21 Dec 2011 17:10:49 -0600
> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> To: cumuluss at web.de
> >> CC: r-sig-mixed-models at r-project.org, a.r.runnalls at kent.ac.uk
> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >
> >> On Wed, Dec 21, 2011 at 5:03 PM,  <cumuluss at web.de> wrote:
> >> > Hi Douglas,
> >> >
> >> > thank you for your reply, but it does not sound that good for me.
> >> Could you please suggest something more, or maybe different, that I
> >> could do? You said there are no simple solutions at present. Is there a
> >> complicated one available which I could try?
> >> > In the third part of your answer, where you mentioned Andrew Runnalls
> >> and the "reimplementing of R": could this also be helpful for my
> >> non-fitting-models problem, or is this for the opening-the-model-results
> >> issue only?
> >>
> >> By "no easy solutions" I mean that I can't think of any approach that
> >> doesn't involve reimplementing the code from scratch, which would
> >> definitely take a long time.  Consider how long we have been working
> >> on getting a 1.0 version of lme4 :-)
> >>
> >> I am not sure if Andrew's CXXR implementation of R would be more
> >> effective or not.  I think it has a better garbage collection scheme
> >> but I haven't tried it and I don't know if lme4 would build in that
> >> system.  I have an item on the "ToDo" list to try it but I have a lot
> >> of items on the "ToDo" list.
> >>
> >> Basically you will need to use a simpler model or fit to a sample of
> >> your data or wait for R-core to determine if it is possible to use
> >> 64-bit indices for atomic R vectors.
> >>
> >> > -------- Original Message --------
> >> >> Date: Wed, 21 Dec 2011 15:47:36 -0600
> >> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> >> To: cumuluss at web.de
> >> >> CC: r-sig-mixed-models at r-project.org, "a.r.runnalls"
> >> <a.r.runnalls at kent.ac.uk>
> >> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >> >
> >> >> On Tue, Dec 20, 2011 at 5:25 PM,  <cumuluss at web.de> wrote:
> >> >> > Hi Douglas,
> >> >> >
> >> >> > The variable 'd' has about 710 levels.
> >> >> >
> >> >> > For your other request: I tried to fit the suggested model, but it
> >> >> was not possible. I tried different approaches, first without any
> >> >> interactions and without the nonlinear term. It fitted; the object
> >> >> size was about 731571664 bytes. Then I successively made the model
> >> >> more complex. With one two-way interaction, with one three-way
> >> >> interaction, or with the nonlinear term, it was roughly the same as
> >> >> before. With five two-way interactions, always with the nonlinear
> >> >> term, the object size went up to 1075643424 bytes. With one
> >> >> additional two-way interaction the model would not fit anymore, with
> >> >> the known error.
> >> >>
> >> >> Which is an indication that the fixed-effects model matrix is
> >> >> getting to be too large.  There are no simple solutions at present.
> >> >> You may find that some packages allow you to fit such large models
> >> >> by working with horizontal chunks of the data and accumulating the
> >> >> result, but extending those to GLMMs would be decidedly non-trivial.
> >> >>
> >> >> > Perhaps another hint: yesterday I attempted to fit a much simpler
> >> >> model with lmer, just to see if this works
> >> >> (mfit=lmer(gr.b ~ f.ag + f.se + o.se + diff + exp.r + kl +
> >> >> (1|f)+(1|o), data=c.data, family=binomial)). It fitted, but I could
> >> >> not open mfit. Trying to see only the coefficients also did not
> >> >> work. I saved the image, and it is unfamiliarly huge, about 1.7 GB.
> >> >>
> >> >> The problem there is that the implicit print(mfit) (which is what I
> >> >> imagine you mean when you say "could not open") ends up taking
> >> >> copies of the whole object, which will eat up all your memory.
> >> >> Development versions of lme4 may eventually help with that.
> >> >>
> >> >> > By the way: after reloading the image, some interesting things
> >> >> happened. An error occurred: slot coefs is not an S4 object. It
> >> >> seems to me that it is not possible to save the model results in an
> >> >> R image. Is that right?
> >> >>
> >> >> It should be possible to save and load such an object, but there is
> >> >> always the problem that when you have an object that is a sizable
> >> >> fraction of the total available memory then you can get bitten if
> >> >> something behind the scenes happens to take a copy at some point in
> >> >> the calculation.  The original design in R for keeping track of when
> >> >> a copy must be made is not the greatest and, as a result, R is
> >> >> somewhat conservative when deciding whether or not to copy an
> >> >> object.  Getting around that limitation would mean reimplementing R,
> >> >> more-or-less from scratch, and Andrew Runnalls is the only person I
> >> >> know who is willing to embark on that.
> >> >>
> >> >> > -------- Original Message --------
> >> >> >> Date: Tue, 20 Dec 2011 09:23:37 -0600
> >> >> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> >> >> To: cumuluss at web.de
> >> >> >> CC: r-sig-mixed-models at r-project.org
> >> >> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >> >> >
> >> >> >> On Mon, Dec 19, 2011 at 5:54 PM,  <cumuluss at web.de> wrote:
> >> >> >> > Hi Douglas,
> >> >> >> >
> >> >> >> > thank you for your reply. This is "mydata"
> >> >> >> >
> >> >> >> > 'data.frame':   3909896 obs. of  19 variables:
> >> >> >> >  $ gr.b  : int  0 0 0 0 0 0 0 0 0 0 ...
> >> >> >> >  $ o.ag  : num  -0.651 -0.651 -0.651 -0.651 -0.651 ...
> >> >> >> >  $ o.rar : num  -0.935 -0.935 -0.935 -0.935 -0.935 ...
> >> >> >> >  $ si    : num  0.299 0.299 0.299 0.299 0.299 ...
> >> >> >> >  $ f.ag  : num  -1.25 -1.36 -1.33 -1.26 -1.21 ...
> >> >> >> >  $ f.se  : Factor w/ 2 levels "F","M": 1 2 1 2 2 2 1 1 1 1 ...
> >> >> >> >  $ o.se  : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
> >> >> >> >  $ diff  : num  -0.536 -0.514 -0.521 -0.534 -0.545 ...
> >> >> >> >  $ exp.r : num  -0.168 -0.168 -0.163 -0.168 -0.168 ...
> >> >> >> >  $ f.rar : num  -0.911 0.215 1.224 -1.086 1.107 ...
> >> >> >> >  $ f.si  : num  1.0008 1.1583 0.0561 -0.4163 0.371 ...
> >> >> >> >  $ kl    : Factor w/ 3 levels "mat","nonkin",..: 1 2 2 2 3 2 1 2 2 2 ...
> >> >> >> >  $ sn    : Factor w/ 2 levels "BS","MS": 1 1 1 1 1 1 1 1 1 1 ...
> >> >> >> >  $ MP_y_n: Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 2 2 ...
> >> >> >> >  $ ratio : num  -0.0506 -0.0506 -0.0506 -0.0506 -0.0506 ...
> >> >> >> >  $ f     : Factor w/ 55 levels "0A0","0A1","0A2",..: 1 6 7 8 9 10 11 13 15 16 ...
> >> >> >> >  $ o     : Factor w/ 552 levels "","00T","00Z",..: 2 2 2 2 2 2 2 2 2 2 ...
> >> >> >> >  $ d     : int  9099 9099 9099 9099 9099 9099 9099 9099 9099 9099 ...
> >> >> >> >  $ MP    : num  6 6 6 6 5 4 6 6 6 6 ...
> >> >> >> >
> >> >> >> >
> >> >> >> > formula for the model:
> >> >> >> > mfit=lmer(gr.b ~ o.ag + o.rar + si + ((f.ag + I(f.ag^2)) *
> >> >> >> (f.se * (o.se + diff + exp.r + f.rar + f.si + kl + sn + MP_y_n +
> >> >> >> ratio))) + (1|f)+(1|o)+(1|d) + offset(log(MP)), data=c.data,
> >> >> >> family=binomial)
> >> >> >>
> >> >> >> > I hope this is what you want to see. Thank you for your help.
> >> >> >>
> >> >> >> My guess is that the problem is with creating the fixed-effects
> >> >> >> model matrix, of which there could be several copies created
> >> >> >> during the evaluation and optimization of the deviance.
> >> >> >>
> >> >> >> Just as a test, could you fit the model for the fixed effects
> >> >> >> only, using glm, and check on what the size of the model matrix
> >> >> >> is?  Something like
> >> >> >>
> >> >> >> glm1 <- glm(gr.b ~ o.ag + o.rar + si + ((f.ag + I(f.ag^2)) *
> >> >> >> (f.se * (o.se + diff + exp.r + f.rar + f.si + kl + sn + MP_y_n +
> >> >> >> ratio))) + offset(log(MP)), data=c.data, family=binomial)
> >> >> >> object.size(model.matrix(glm1))
> >> >> >>
> >> >> >> Also, could you convert 'd' to a factor and run str again so we
> >> >> >> can learn how many levels there are?  Either that or send the
> >> >> >> result of
> >> >> >>
> >> >> >> length(unique(mydata$d))
> >> >> >>
> >> >> >>
> >> >> >> > -------- Original Message --------
> >> >> >> >> Date: Mon, 19 Dec 2011 14:50:06 -0600
> >> >> >> >> From: Douglas Bates <bates at stat.wisc.edu>
> >> >> >> >> To: cumuluss at web.de
> >> >> >> >> CC: r-sig-mixed-models at r-project.org
> >> >> >> >> Subject: Re: [R-sig-ME] problems with allocate memory
> >> >> >> >
> >> >> >> >> On Sun, Dec 18, 2011 at 3:17 PM,  <cumuluss at web.de> wrote:
> >> >> >> >> > Hi to everyone,
> >> >> >> >>
> >> >> >> >> > I have been trying to fit a glmm with a binomial error
> >> >> >> >> structure. My model is a little bit complex: I have 8
> >> >> >> >> continuous predictor variables, one of them as a nonlinear
> >> >> >> >> term, and 5 categorical predictor variables with some
> >> >> >> >> three-way interactions between them. Additionally, I have 3
> >> >> >> >> random effects and one offset variable in the model. The
> >> >> >> >> number of obs is greater than 3 million.
> >> >> >> >> > I'm working with the latest version of R, 2.14.0, on a
> >> >> >> >> 64-bit Windows system with 8 GB RAM.
> >> >> >> >> > Nothing I tried (reducing model complexity, a different
> >> >> >> >> 64-bit PC with even more memory) led to a fitted model; the
> >> >> >> >> error always occurs: cannot allocate vector of size 2GB.
> >> >> >> >> > Is there anything I can do? I would be very grateful for any
> >> >> >> >> commentary.
> >> >> >> >>
> >> >> >> >> You probably have multiple copies of some large objects
> >> >> >> >> hanging around.  Can you send us the output of
> >> >> >> >>
> >> >> >> >> str(myData)
> >> >> >> >>
> >> >> >> >> where 'myData' is the name of the model frame containing the
> >> >> >> >> data you are using, and the formula for the model you are
> >> >> >> >> trying to fit?
> >> >> >> >
> >> >> >> > _______________________________________________
> >> >> >> > R-sig-mixed-models at r-project.org mailing list
> >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >




More information about the R-sig-mixed-models mailing list