[R] randomForest question--problem with ntree

Mary Putt mputt at mail.med.upenn.edu
Thu Aug 13 23:11:32 CEST 2009


Hi, 

I would like to use a random forest model to get an idea of which variables in a dataset may have some prognostic significance in a smallish study. The default number of trees seems to be 500. I tried changing it to ntree=2000 or ntree=200, and the results appear identical. I have changed mtry from mtry=5 to mtry=6 successfully. I have seen the same problem on both a Windows machine and our Linux system, running R 2.8 and 2.9.

Sample code follows.

Thanks in advance for help. 

Mary


> m1<-as.formula(paste("as.factor(EAD)~", paste(names(clin_b)[c(5,7,10:36 )], collapse="+")))
> m1
as.factor(EAD) ~ R_AGE + R_BMI + ASCITES...1L. + EOTAXIN + GM.CSF + 
    IFNa + IL.10 + IL.12.p40.p70 + IL.13 + IL.15 + IL.17 + IL.2 + 
    IL.4 + IL.5 + IL.6 + IL.7 + IL.8 + IL1.RA + IL2.R + IP.10 + 
    MCP.1 + MIG + MIP.1a + MIP.1b + RANTES + TNFa + Male + diagnosis + 
    race
> set.seed(12345)
> rF.bsl<-randomForest(m1, data=clin_b, na.action=na.omit, mtry=6, n.tree=2000)
> rF.bsl$ntree
[1] 500
> rF.bsl$mtry
[1] 6
> print(rF.bsl)

Call:
 randomForest(formula = m1, data = clin_b, mtry = 6, n.tree = 2000,  na.action = na.omit) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 6

        OOB estimate of  error rate: 39.66%
Confusion matrix:
   0 1 class.error
0 27 7   0.2058824
1 16 8   0.6666667
> 
> 
> set.seed(12345)
> rF.bsl<-randomForest(m1, data=clin_b, na.action=na.omit, mtry=6, n.tree=100)
> rF.bsl$ntree
[1] 500
> rF.bsl$mtry
[1] 6
> print(rF.bsl)

Call:
 randomForest(formula = m1, data = clin_b, mtry = 6, n.tree = 100,      na.action = na.omit) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 6

        OOB estimate of  error rate: 39.66%
Confusion matrix:
   0 1 class.error
0 27 7   0.2058824
1 16 8   0.6666667
> 
>
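
One thing visible in the transcript above is that the calls pass n.tree, while the text refers to ntree. The following is a minimal base-R sketch of how that could produce the reported behaviour. It assumes the fitting function declares an ntree argument followed by "...". fit_sketch is a hypothetical stand-in written for illustration, not the randomForest source.

```r
# Hypothetical stand-in for a fitting function that declares `ntree` and `...`.
# It only mimics R's argument matching; it does not fit any model.
fit_sketch <- function(x, ntree = 500, mtry = 5, ...) {
  list(ntree = ntree, mtry = mtry, ignored = names(list(...)))
}

# "n.tree" is neither an exact nor a partial match for "ntree",
# so it is collected by `...` and the default ntree is kept.
a <- fit_sketch(1, mtry = 6, n.tree = 2000)

# The exact argument name is matched and overrides the default.
b <- fit_sketch(1, mtry = 6, ntree = 2000)

print(a$ntree)    # value used when the name is misspelled
print(a$ignored)  # names that were swallowed by `...`
print(b$ntree)    # value used when the name is spelled `ntree`
```

If the same thing is happening here, spelling the argument as ntree=2000 in the randomForest call would be the first thing to try, followed by checking rF.bsl$ntree again.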



