[R] Can't seem to finish a randomForest.... Just goes and goes!
David L. Van Brunt, Ph.D.
dvanbrunt at well-wired.com
Mon Apr 5 01:44:02 CEST 2004
Playing with randomForest, samples run fine. But on real data, no go.
Here's the setup: OS X, same behavior whether I'm using R-Aqua 1.8.1 or the
Fink compile-of-my-own with X-11, R version 1.8.1.
This is on OS X 10.3 (aka "Panther"), G4 800 MHz with 512 MB physical RAM.
I have not altered the Startup options of R.
Data set is read in from a text file with "read.table", and has 46 variables
and 1,855 cases.
The DV is categorical, 0 or 1. Most of the IVs are either continuous, or
correctly read in as factors. The largest factor has 30 levels.... Only the
DV seems to need identifying as a factor to force classification trees over
regression. Trying the following:
> Mydata$V46 <- as.factor(Mydata$V46)
> Myforest.rf <- randomForest(V46 ~ ., data = Mydata, ntrees = 100, mtry = 7,
+                             proximities = FALSE, importance = FALSE)
5 hours later, R.bin was still taking up 75% of my processor. When I've
tried this with larger data, I get errors referring to the buffer (sorry,
not in front of me right now).
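
One thing that may be worth checking before touching any memory settings: the
randomForest documentation spells the arguments "ntree" and "proximity", so a
call that passes "ntrees" or "proximities" may simply fall back to the defaults
(500 trees). Below is a minimal sketch, assuming the randomForest package is
installed, that times a small forest with progress output before committing to
a long run. The "toy" data frame and its column names are made up for
illustration and only stand in for Mydata. It also lists the number of levels
per factor, on the assumption (worth checking against the package docs) that
factors with many levels can make the split search more expensive.

```r
# Synthetic stand-in for the real data: 1,855 cases, a 0/1 outcome,
# a few continuous predictors and one factor. All names are hypothetical.
set.seed(1)
n <- 1855
toy <- data.frame(
  x1 = rnorm(n),
  x2 = runif(n),
  x3 = rnorm(n),
  f1 = factor(sample(letters[1:5], n, replace = TRUE)),
  y  = factor(rbinom(n, 1, 0.5))   # factor outcome => classification
)

# Number of levels per factor column (NA for non-factor columns).
n_levels <- sapply(toy, function(col) if (is.factor(col)) nlevels(col) else NA)
print(n_levels)

fit <- NULL
if (requireNamespace("randomForest", quietly = TRUE)) {
  # Documented argument names: ntree, mtry, proximity, importance, do.trace.
  elapsed <- system.time(
    fit <- randomForest::randomForest(
      y ~ ., data = toy,
      ntree = 10, mtry = 2,
      proximity = FALSE, importance = FALSE,
      do.trace = 5               # report running OOB error every 5 trees
    )
  )
  print(elapsed)
  print(fit$ntree)               # number of trees actually grown
}
```

If the small run finishes quickly, scaling "ntree" up in steps gives a rough
idea of how long the full forest should take.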
Any ideas on this? The data don't seem horrifically large. Seems like there
are a few options for setting memory size, but I'm not sure which of them
to try tweaking, or if that's even the issue.
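
For a rough sense of scale, here is a back-of-the-envelope sketch using only
base R. It assumes every one of the 46 variables is stored as an 8-byte double,
which overstates the size of factor columns, and the matrix is a stand-in
rather than the actual data. On that assumption the raw data come to well under
a megabyte, which suggests the data size alone is unlikely to be the problem.

```r
n_cases <- 1855
n_vars  <- 46
bytes_per_double <- 8

# Estimated storage for the raw values.
approx_bytes <- n_cases * n_vars * bytes_per_double
approx_mb    <- approx_bytes / 1024^2
cat("Estimated size:", approx_bytes, "bytes, about",
    round(approx_mb, 2), "MB\n")

# Compare with what R reports for a numeric matrix of the same shape
# (slightly larger because of the object header).
m <- matrix(0, nrow = n_cases, ncol = n_vars)
measured <- as.numeric(object.size(m))
cat("object.size reports:", measured, "bytes\n")
```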