[R] Competing risk regression with CRR slow on large datasets?

Max Gordon dr.max.gordon at gmail.com
Wed Jul 20 22:18:53 CEST 2011


Hi,

I posted this question on stats.stackexchange.com three days ago, but
the answer didn't really address my question about the speed of
competing risk regression. I hope you don't mind me asking it in this
forum:

I’m doing a registry-based study with almost 200 000 observations and
I want to perform a competing risk analysis. My problem is that the
run time of crr() in the cmprsk package grows very steeply, seemingly
exponentially, with the number of observations. I therefore wrote a
simulation to try different approaches and check how factors, data
frames and matrices affect performance, so that I could choose the
most efficient combination.
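
The kind of timing comparison described above can be sketched roughly
as follows. This is only a minimal illustration, not the simulation
from the linked question: the simulated data, the helper name
time_crr() and the sample sizes are assumptions made for the example.

```r
library(cmprsk)  # provides crr()

set.seed(1)

# Hypothetical helper: simulate n observations and time a single crr() fit.
time_crr <- function(n) {
  x       <- rbinom(n, 1, 0.5)               # one binary covariate
  ftime   <- rexp(n)                         # follow-up times
  fstatus <- sample(0:2, n, replace = TRUE)  # 0 = censored, 1 = event, 2 = competing event
  cov     <- cbind(x = x)                    # crr() expects a covariate matrix
  system.time(crr(ftime, fstatus, cov))[["elapsed"]]
}

# Small, assumed sample sizes; double n each step to see how run time scales.
sizes   <- c(1000, 2000, 4000, 8000)
elapsed <- sapply(sizes, time_crr)
data.frame(n = sizes, seconds = elapsed)
```

If doubling n roughly quadruples the elapsed time, the growth is closer
to quadratic than to exponential, which would still be prohibitive at
200 000 observations.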

I have a one-year-old computer with 8 GB of RAM, and it still hadn’t
finished a run with 70 000 observations when I left it running
overnight.

My main questions:

 -  Is there a faster way of performing competing risk analysis?
 -  Why does Win7 perform 4 times better? Is it the 64-bit version
that improves the performance?
 -  Can I do something to speed things up?
 -  Is the simulation similarly slow on your computer? (see simulation
code at the end)

Thanks,
Max

To see the output and the simulation code see the original question at
http://stats.stackexchange.com/questions/13151/competing-risk-regression-with-crr-slow-on-large-datasets
