[R] MART(tm) vs. gbm
manuel martin
mam2071 at med.cornell.edu
Fri May 26 15:43:34 CEST 2006
Hello,
I tried two different implementations of stochastic gradient
boosting (Friedman 2002): the MART(tm) with R tool
(http://www-stat.stanford.edu/~jhf/R-MART.html) and the gbm R package.
To me, the two seemed fairly comparable, except perhaps for the
different loss criteria offered and the fact that the gbm tool is
slightly more convenient to use. However, the MART with R tool seems
to systematically outperform the gbm tool in terms of goodness of fit
(whatever the way of choosing the best iteration for the gbm package).
I tried to find specific options that could explain this, but nothing
came out. See below for an example of how I compare the two
implementations. Has anyone had the same experience? Can anyone give
me hints about such performance differences, or tell me if I am
missing something obvious?
Thank you in advance, Manuel
Here are the arguments and options I used for comparison purposes,
working on a dataset of 1600 records by 15 variables:
# the MART with R tool
lx <-
mart( as.matrix(x), y, c(1,1,1,1,1,1,1,1,1,1,1,1,1,2,2),
      niter=1000, tree.size=6, learn.rate=0.01,
      loss.cri=2  # gaussian
)
# for gbm
gbm1 <- gbm(y ~ v1 + v2 + v3 + v4 + v5 + v6 + v7 + v8 + v9 + v10 +
            v11 + v12 + v13 + v14 + v15,
            data=data, var.monotone=c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
            distribution="gaussian", n.trees=1000, shrinkage=0.01,
            interaction.depth=6, bag.fraction = 0.5, train.fraction = 0.5,
            n.minobsinnode = 10, cv.folds = 1, keep.data=TRUE)
# I then do predictions on the same dataset, and further perform
# goodness-of-fit comparisons
# ...
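For concreteness, the comparison step might look like the following sketch. This is an assumption on my part, not the code I actually ran: `gbm.perf()` and `predict()` are from the gbm package, while `martpred()` is my guess at the MART-with-R prediction function name, and the `rmse()` helper is mine.

# hypothetical sketch of the comparison step, assuming the fits above
pred.mart <- martpred(as.matrix(x))           # MART with R: predict from the last mart() fit
best.iter <- gbm.perf(gbm1, method = "test")  # gbm: pick the best iteration on the test fold
pred.gbm  <- predict(gbm1, data, n.trees = best.iter)

# goodness of fit via root mean squared error
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
rmse(y, pred.mart)
rmse(y, pred.gbm)

The gap I observe is in these RMSE values, with the mart() fit consistently lower regardless of how best.iter is chosen.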