[R] Competing risk regression with CRR slow on large datasets?
Max Gordon
dr.max.gordon at gmail.com
Wed Jul 20 22:18:53 CEST 2011
Hi,
I posted this question on stats.stackexchange.com 3 days ago, but the
answer didn't really address my question about the speed of
competing risk regression. I hope you don't mind me asking it in this
forum:
I’m doing a registry-based study with almost 200 000 observations and
I want to perform a competing risk analysis. My problem is that the
running time of crr() in the cmprsk package increases exponentially
with the number of observations. I therefore wrote a simulation to
try different approaches and check how factors, data frames and
matrices affect performance, so that I could choose the most
efficient combination.
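A minimal sketch of this kind of timing test is given here for readers who cannot follow the link. It is not the original simulation: the sample sizes, the simulated covariates x1 and x2, and the helper time_crr() are assumptions made only for illustration. It times crr() on simulated data of increasing size with a plain matrix of covariates.

```r
library(cmprsk)  # provides crr()

set.seed(1)

# Hypothetical helper: simulate n observations and time one crr() fit.
time_crr <- function(n) {
  ftime   <- rexp(n)                          # simulated follow-up times
  fstatus <- sample(0:2, n, replace = TRUE)   # 0 = censored, 1 and 2 = competing events
  cov1    <- matrix(rnorm(2 * n), ncol = 2,
                    dimnames = list(NULL, c("x1", "x2")))  # covariates as a matrix
  system.time(
    crr(ftime, fstatus, cov1, failcode = 1, cencode = 0)
  )[["elapsed"]]
}

# Small sizes chosen so the sketch finishes quickly; increase with care.
sizes   <- c(1000, 2000, 4000)
elapsed <- sapply(sizes, time_crr)
print(data.frame(n = sizes, seconds = elapsed))
```

Comparing the seconds column as n doubles gives a rough picture of how the running time scales on a given machine.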
I have a one-year-old computer with 8 GB of RAM, and it still hadn’t
finished 70 000 observations after I left it running overnight.
My main questions:
- Is there a faster way of performing competing risk analysis?
- Why does Win7 perform 4 times better? Is it the 64-bit version
that improves the performance?
- Can I do something to speed things up?
- Is the simulation similarly slow on your computer? (see simulation
code at the end)
Thanks,
Max
For the output and the simulation code, see the original question at
http://stats.stackexchange.com/questions/13151/competing-risk-regression-with-crr-slow-on-large-datasets