[BioC] mtry in randomForest
Liaw, Andy
andy_liaw at merck.com
Fri Jul 2 15:34:29 CEST 2004
The problem is that the importance measure in randomForest is based on
random permutations of the data, so even with the same mtry, multiple runs
of randomForest will give you different importance measures, and thus
rankings. This is especially true for a small number of samples. People have
tried using 10000-30000 or even more trees to get a more stable ranking of
importance measures in RF.
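
As a rough sketch of the point above, something like the following compares
the importance rankings from two runs that differ only in the random seed,
once with few trees and once with many. The data are simulated here, not
Xin's: the matrix dimensions, the seeds, mtry = 14, and the helper names
run_importance and rank_agreement are all made up for illustration. Averaging
the importance over several runs, as in the last lines, is just one possible
way to steady the ranking, not something the package does for you.

```r
library(randomForest)

# Simulated stand-in for the poster's data (hypothetical values):
# 12 samples, 200 "genes", two classes. Xm and trainY mirror the
# object names used in the question.
set.seed(1)
n_samples <- 12
n_genes <- 200
Xm <- matrix(rnorm(n_samples * n_genes), nrow = n_samples,
             dimnames = list(NULL, paste0("gene", seq_len(n_genes))))
trainY <- factor(rep(c("A", "B"), each = n_samples / 2))
# Shift the first five genes in class B so that they carry some signal
Xm[trainY == "B", 1:5] <- Xm[trainY == "B", 1:5] + 2

# Fit one forest with a given seed and return the permutation-based
# importance (type = 1, mean decrease in accuracy) as a named vector
run_importance <- function(seed, ntree) {
  set.seed(seed)
  rf <- randomForest(Xm, trainY, ntree = ntree, mtry = 14,
                     importance = TRUE)
  importance(rf, type = 1)[, 1]
}

# Spearman rank correlation between the importance values of two runs
# that differ only in the seed
rank_agreement <- function(ntree) {
  cor(run_importance(101, ntree), run_importance(202, ntree),
      method = "spearman")
}

agreement_small <- rank_agreement(200)
agreement_large <- rank_agreement(5000)
print(c(ntree_200 = agreement_small, ntree_5000 = agreement_large))

# Average the importance over several seeds and list the top genes
imp_runs <- sapply(1:5, run_importance, ntree = 2000)
mean_imp <- rowMeans(imp_runs)
print(head(sort(mean_imp, decreasing = TRUE), 10))
```

If the rank correlation stays low even with many trees, the ranking is
probably too noisy to read much into the order of individual genes.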
HTH,
Andy
> From: Liu, Xin
>
> Hello group,
>
> I created an Expression Set as the following:
>
> Expression Set (exprSet) with
> 4986 genes
> 12 samples
> phenoData object with 2 variables and 12 cases
> varLabels
> cov1: Genotype
> cov2: Treatment
>
> then I ran Random Forest:
> rf <- randomForest(Xm, trainY, ntree = 2000, mtry = 70,
> importance = TRUE)
> var.imp.plot(rf, n.var = 30)
>
> The problem is that when I chose different values of mtry, such as
> mtry = 70, mtry = 65, and mtry = 60, I got totally different lists of
> important genes. This is really confusing.
>
> Any suggestions? Thank you.
>
> Xin LIU