[R] randomForest question--problem with ntree
Mary Putt
mputt at mail.med.upenn.edu
Thu Aug 13 23:11:32 CEST 2009
Hi,
I would like to use a random forest model to get an idea of which variables in a dataset may have some prognostic significance in a smallish study. The default number of trees seems to be 500. I tried changing the default to ntree=2000 or ntree=200, and the results appear identical. I have successfully changed mtry from mtry=5 to mtry=6. I have seen the same problem on both a Windows machine and our Linux system, running R 2.8 and 2.9.
Sample code follows.
Thanks in advance for help.
Mary
> m1<-as.formula(paste("as.factor(EAD)~", paste(names(clin_b)[c(5,7,10:36 )], collapse="+")))
> m1
as.factor(EAD) ~ R_AGE + R_BMI + ASCITES...1L. + EOTAXIN + GM.CSF +
IFNa + IL.10 + IL.12.p40.p70 + IL.13 + IL.15 + IL.17 + IL.2 +
IL.4 + IL.5 + IL.6 + IL.7 + IL.8 + IL1.RA + IL2.R + IP.10 +
MCP.1 + MIG + MIP.1a + MIP.1b + RANTES + TNFa + Male + diagnosis +
race
>
> set.seed(12345)
> rF.bsl<-randomForest(m1, data=clin_b, na.action=na.omit, mtry=6, n.tree=2000)
> rF.bsl$ntree
[1] 500
> rF.bsl$mtry
[1] 6
> print(rF.bsl)
Call:
randomForest(formula = m1, data = clin_b, mtry = 6, n.tree = 2000, na.action = na.omit)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 6
OOB estimate of error rate: 39.66%
Confusion matrix:
   0 1 class.error
0 27 7   0.2058824
1 16 8   0.6666667
>
> set.seed(12345)
> rF.bsl<-randomForest(m1, data=clin_b, na.action=na.omit, mtry=6, n.tree=100)
> rF.bsl$ntree
[1] 500
> rF.bsl$mtry
[1] 6
> print(rF.bsl)
Call:
randomForest(formula = m1, data = clin_b, mtry = 6, n.tree = 100, na.action = na.omit)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 6
OOB estimate of error rate: 39.66%
Confusion matrix:
   0 1 class.error
0 27 7   0.2058824
1 16 8   0.6666667
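
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the calls above pass n.tree, while the component printed by rF.bsl$ntree and the argument documented for randomForest() is spelled ntree. An argument name that matches nothing can be absorbed silently by a function's `...` catch-all, which leaves the default of 500 in place. The toy function grow_forest() below is hypothetical and only imitates that argument-matching behaviour in base R; it is not the package's code.

```r
# Hypothetical stand-in for a function that has a default ntree and a `...`
# catch-all. It only illustrates R's argument matching, not randomForest itself.
grow_forest <- function(x, ntree = 500, mtry = 5, ...) {
  extra <- list(...)  # any argument whose name matches nothing lands here
  if (length(extra) > 0) {
    message("Ignored arguments: ", paste(names(extra), collapse = ", "))
  }
  list(ntree = ntree, mtry = mtry)
}

# "n.tree" is not a prefix of "ntree", so it is swallowed by `...`
fit_wrong <- grow_forest(1:10, mtry = 6, n.tree = 2000)
# The exact name is matched and overrides the default
fit_right <- grow_forest(1:10, mtry = 6, ntree = 2000)

fit_wrong$ntree  # 500
fit_right$ntree  # 2000

# Assuming the same mechanism applies here, the corrected call would be:
# rF.bsl <- randomForest(m1, data = clin_b, na.action = na.omit,
#                        mtry = 6, ntree = 2000)
```

If this is the cause, rF.bsl$ntree should report the requested number of trees once the argument is spelled ntree. It would also explain why mtry=6 took effect in both runs: that name matches exactly.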