[R] how to evaluate the significance of attributes in tree growing
Liaw, Andy
andy_liaw at merck.com
Thu Jan 27 02:42:51 CET 2005
FWIW, a while ago I wrote a little function to extract variable importance
as defined in the CART book. It's rather limited: it only works for
regression problems, and you need to fit the tree with maxsurrogate=0 and
maxcompete=0. It may (or may not) help you:
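varimp.rpart <- function(x) {
  ## Node deviance for every row of the frame; drop the leaves.
  dev <- x$frame[, c("var", "dev")]
  dev <- dev[dev$var != "<leaf>", ]
  ## With maxsurrogate=0 and maxcompete=0, splits has one row per internal
  ## node, in the same order as the non-leaf rows of frame.
  improve <- x$splits[, "improve"]
  ## Sum over splits by variable; [-1] drops the empty "<leaf>" level.
  imp <- tapply(dev[, 2] * improve, dev$var, sum)[-1]
  ## Variables never used in a split come back as NA; report them as 0.
  imp[is.na(imp)] <- 0
  imp
}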
Here's an example using the Boston housing data:
> library(rpart)
> data(Boston, package="MASS")
> boston.rp <- rpart(medv ~ ., Boston,
+                    control=rpart.control(maxsurrogate=0, maxcompete=0))
> varimp.rpart(boston.rp)
     crim        zn     indus      chas       nox        rm       age       dis
 1136.809     0.000     0.000     0.000     0.000 23825.922     0.000  1544.804
      rad       tax   ptratio     black     lstat
    0.000     0.000     0.000     0.000  7988.955
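
For anyone wondering why the function multiplies the node deviance by
"improve": the sketch below checks, at the root node only, that this product
matches the drop in deviance from parent to children. It assumes that for
regression (anova) trees rpart stores "improve" as a fraction of the node's
deviance, and that all case weights are 1. The object names (fit, dev,
decrease, scaled) are just for illustration.

```r
library(rpart)
data(Boston, package = "MASS")

fit <- rpart(medv ~ ., Boston,
             control = rpart.control(maxsurrogate = 0, maxcompete = 0))

## Deviance (residual sum of squares) of the root and its two children.
## rpart numbers the children of node n as 2n and 2n + 1, and uses the
## node numbers as row names of the frame.
dev <- fit$frame[c("1", "2", "3"), "dev"]
decrease <- dev[1] - dev[2] - dev[3]

## With no surrogate or competitor rows, the first row of splits is the
## root's primary split.
scaled <- dev[1] * fit$splits[1, "improve"]

## Should be TRUE if "improve" is on the relative scale assumed above.
all.equal(decrease, unname(scaled))
```

If the assumption holds, the importance of a variable is simply the total
decrease in residual sum of squares over all the splits that use it.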
Both gbm and randomForest have analogous measures.
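
As a rough sketch of the randomForest counterpart (assuming the randomForest
package is installed and the data fit in memory; ntree = 100 and the seed are
arbitrary choices for illustration), the node-impurity importance can be
pulled out like this:

```r
library(randomForest)
data(Boston, package = "MASS")

set.seed(1)  # arbitrary seed, only so that repeated runs agree
boston.rf <- randomForest(medv ~ ., data = Boston, ntree = 100)

## type = 2 asks for the node-impurity measure; for regression this is the
## total decrease in residual sum of squares from splitting on each
## variable, averaged over the trees (column "IncNodePurity").
rf.imp <- importance(boston.rf, type = 2)
rf.imp[order(rf.imp[, "IncNodePurity"], decreasing = TRUE), , drop = FALSE]
```

The numbers will not match the single-tree values above, since they are
averaged over many randomized trees, but the ranking can be read the same way.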
Andy
> From: WeiWei Shi
>
> Hi, there:
>
> I am wondering if there is a package in R (doing decision trees) which
> can provide some methods to evaluate the significance of attributes. I
> remembered randomForest gives some output like that. Unfortunately my
> current computing env. cannot handle my datasets if I use
> randomForest. So, I am thinking if other packages can do this job or
> not.
>
>
> Thanks,
>
> Ed
>