[R-sig-DCM] number of iterations

Wirth, Ralph (GfK SE) ralph.wirth at gfk.com
Wed Feb 23 18:56:24 CET 2011


I think Dimitri's problem is due to the large number of respondents. Their draws are generated in the inner (i.e. the second) loop. Loops are slow in R, and apply doesn't help much. I once tried to vectorize the inner loop, but then I ran into problems with RAM.

If you know the C programming language, you could write the loops in C and call this C code from within R. This should speed up your calculations A LOT (people told me that estimation can easily become 10 times faster).

As to iterations: I usually let my R function do more than 100000 iterations, then look at the matplots and either use the draws after convergence for calculating the point estimates or (if convergence hasn't occurred yet) continue the MCMC algorithm.
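
The workflow described above (plot the draws, discard everything before convergence, average the rest) can be sketched in base R. This is only an illustration on simulated draws: the matrix `draws`, the `target` values and the `burn_in` cut-off are hypothetical stand-ins, not output from any real model.

```r
# Simulated stand-in for MCMC output: rows = saved draws, columns = parameters.
set.seed(1)
n_draws <- 2000
n_par   <- 3
target  <- c(-1, 0.5, 2)                      # hypothetical values the chains settle on
drift   <- 1 - exp(-(1:n_draws) / 100)        # chains start near 0 and level off
draws   <- outer(drift, target) +
  matrix(rnorm(n_draws * n_par, sd = 0.1), n_draws, n_par)

# Trace plot: one line per parameter; roughly horizontal lines suggest convergence.
matplot(draws, type = "l", lty = 1, xlab = "saved draw", ylab = "value")

# Keep only the draws after the (visually chosen) burn-in and average them.
burn_in   <- 1000
post      <- draws[(burn_in + 1):n_draws, , drop = FALSE]
point_est <- colMeans(post)
print(round(point_est, 2))
```

If the lines are still trending at the end of the plot, the cut-off cannot be chosen yet and the sampler would need to run longer.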

Regards,
Ralph



----- Original Message -----
From: r-sig-dcm-bounces at r-project.org <r-sig-dcm-bounces at r-project.org>
To: Dimitri Liakhovitski <dimitri.dcm at gmail.com>
Cc: R DCM List <r-sig-dcm at r-project.org>
Sent: Wed Feb 23 18:47:03 2011
Subject: Re: [R-sig-DCM] number of iterations

Wow -- your problem is large.  I should have noted that my runs were with binary logit and much smaller samples (usually only a dozen or so choices for N~200).  Still, I run on a 64-bit Xeon machine with 16GB RAM.  Runs of many hours are usual -- I think the max was about 60 hours (but I didn't time it and it was over a weekend).  I haven't done a direct comparison, but would *guess* that bayesm is 25-100x slower than Sawtooth's CBC/HB.  I know some academics who say that they code real problems in Fortran ...

One thing I wonder about bayesm is whether there are obvious optimizations that would be easy to do.  (For instance, in some of my own MNL code, I got a speedup of 7x (!!) simply by replacing two calls of "apply(x,1,sum)" with "rowSums(x)")  Might be worth a quick look ...
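
The `apply(x,1,sum)` versus `rowSums(x)` replacement mentioned above is easy to try on a made-up matrix. The sketch below only checks that both give the same row totals and prints the elapsed times; the matrix size is arbitrary and the size of the speedup will depend on the machine and the data.

```r
set.seed(42)
x <- matrix(runif(200000 * 5), nrow = 200000, ncol = 5)   # arbitrary test matrix

# Time both versions; system.time() returns user, system and elapsed seconds.
t_apply   <- system.time(s_apply   <- apply(x, 1, sum))["elapsed"]
t_rowsums <- system.time(s_rowsums <- rowSums(x))["elapsed"]

# Both approaches should agree up to floating-point error.
stopifnot(isTRUE(all.equal(s_apply, s_rowsums)))
cat("apply:", t_apply, "s; rowSums:", t_rowsums, "s\n")
```

`apply` calls an R function once per row, whereas `rowSums` does the whole job in compiled code, which is where the difference comes from.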



From: Dimitri Liakhovitski
Sent: Wednesday, February 23, 2011 4:17 PM
To: Chris Chapman
Cc: R DCM List
Subject: Re: [R-sig-DCM] number of iterations


Yes, I've taken Greg's tutorial - but I would not say it was of much use to practitioners, and especially it had nothing to do with DCM...

Wow, 50k! 100k!
I've just done a DCM in bayesm (rhierMnlRwMixture) - 4 attributes, total # of levels 17, 7 tasks but a very large sample size (~4,200). I also had a categorical covariate with 8 levels, i.e., I had 7 dummy-coded centered covariates. It refused to run on my laptop (ran out of memory), but it did run on my powerful 64-bit Windows 7 PC (R 12.2 - for 64 bits). It did not run out of memory, but I've done 21K iterations in total and it took me 9.5 hours (!).
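
A rough back-of-the-envelope calculation suggests why memory can be a problem with a sample this large. The sketch assumes 13 part-worths per respondent (17 levels minus 4 attributes, i.e. dummy or effects coding with nothing else added) and assumes the respondent-level draws are stored as 8-byte doubles; neither assumption is stated in the post, and the function `mem_gib` is made up for this illustration.

```r
n_resp   <- 4200            # respondents (from the post)
n_levels <- 17              # total number of levels (from the post)
n_attr   <- 4               # attributes (from the post)
n_par    <- n_levels - n_attr   # 13 part-worths per respondent: an ASSUMPTION
n_iter   <- 21000           # iterations actually run (from the post)
bytes_per_double <- 8

# Size in GiB of the stored respondent-level draws when every keep-th draw is saved.
mem_gib <- function(keep) {
  n_resp * n_par * (n_iter / keep) * bytes_per_double / 2^30
}
print(round(c(keep_1 = mem_gib(1), keep_10 = mem_gib(10)), 2))
```

Under these assumptions, storing every draw needs roughly 8.5 GiB for the respondent-level draws alone, while storing only every 10th draw cuts that by a factor of ten.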

Dimitri



On Wed, Feb 23, 2011 at 11:05 AM, Chris Chapman <cnchapman at msn.com> wrote:

  Hi Dimitri --

  For my part, yes, I think it all depends :-)  The usual recommendation is to run it "quite a while" (50k+ iterations) and inspect the convergence of the estimates (i.e., plot the draws and see if there are approximately horizontal lines after a certain number of iterations, with no "blow ups" of individual lines or major crossovers among them).

  Personally, I tend to start with 100k iterations (only because I like round numbers) and take beta draws every 10 of the final 20k.  If it doesn't converge but looks plausible (not all over the place), then I try 200k.  If it still doesn't converge, I'll decide what to do based on how bad the convergence plots look.  That's assuming something like 6-8 attributes and 30-40 total levels in a CBC model.
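
  The "every 10th draw of the final 20k out of 100k" rule above comes down to simple index arithmetic. The sketch below assumes the sampler has already stored every 10th draw in an array laid out as respondents x parameters x saved draws (which is how I understand bayesm's betadraw output to be organized); the array here is random noise with made-up dimensions, used only to show the indexing.

```r
R_total <- 100000   # total iterations
n_final <- 20000    # iterations at the end of the run used for estimates
thin    <- 10       # use every 10th draw

# Iteration numbers that end up being used: 2000 of them.
keep_iter <- seq(R_total - n_final + thin, R_total, by = thin)
length(keep_iter)

# If every 10th draw was saved during the run, the same draws are the
# last n_final / thin slices of the saved array.
n_saved   <- R_total / thin
saved_idx <- (n_saved - n_final / thin + 1):n_saved

# Hypothetical stand-in for saved draws: 5 respondents x 3 parameters x 10000 draws.
set.seed(7)
betadraw <- array(rnorm(5 * 3 * n_saved), dim = c(5, 3, n_saved))

# Respondent-level point estimates: mean over the retained draws.
est <- rowMeans(betadraw[, , saved_idx], dims = 2)
dim(est)
```

  `rowMeans(..., dims = 2)` averages over the third dimension and returns one row per respondent and one column per parameter.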

  (BTW, for a better answer ... Greg Allenby [one of the authors of bayesm] offers tutorials at ART Forum most years that go into the general bayesm approach in substantial depth.  I'd bet you've taken that already, though :-)

  -- Chris

  --------------------------------------------------
  From: "Dimitri Liakhovitski" <dimitri.dcm at gmail.com>
  Sent: Wednesday, February 23, 2011 3:37 PM
  To: "R DCM List" <r-sig-dcm at r-project.org>
  Subject: [R-sig-DCM] number of iterations


    Question for those who have done HB to assess DCM utilities in bayesm:

    I know, this question is too general and it all depends on the nature of the
    DCM at hand, # of attributes, # of levels, etc.
    But in general: when you run HB in bayesm, how many iterations do you run in
    total and how many do you use to grab your beta draws from?

    Thank you!
    Dimitri



    _______________________________________________
    R-SIG-DCM mailing list
    R-SIG-DCM at r-project.org
    https://stat.ethz.ch/mailman/listinfo/r-sig-dcm






