[R] Gradient Boosting Trees with correlated predictors in gbm

Max Kuhn mxkuhn at gmail.com
Tue Mar 2 21:10:37 CET 2010


On Tue, Mar 2, 2010 at 2:43 PM, Liaw, Andy <andy_liaw at merck.com> wrote:
> In most implementations of boosting, and for that matter, single tree,
> the first variable wins when there are ties.

They must be in a union :-)

>> What happens if there's a third?

If there were P perfectly correlated predictors, the importance would
be 100% for the first one encountered by gbm. In reality, where
the correlation is strong but not perfect, the other variables would
show up with small importances. In the case of RF, the "dilution
factor" is 1/P for perfect correlations and gets fuzzier as the
correlation decreases (for reasons that Andy articulated).
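
To make the two cases concrete, here is a minimal base-R sketch. It is
not gbm's or randomForest's code: the stump booster, the "RF-like"
sampler, and every function name below are hypothetical, written only to
mimic the tie-breaking behaviour described above. It assumes identical
columns as the perfectly correlated case, and mtry = 1 for the RF-like
version.

```r
# Toy sketch in base R; this is NOT gbm's or randomForest's code.
set.seed(1)
n <- 200
P <- 3
x1 <- runif(n)
X <- matrix(rep(x1, P), ncol = P,
            dimnames = list(NULL, paste0("x", 1:P)))  # P identical columns
y <- sin(2 * pi * x1) + rnorm(n, sd = 0.1)

# Best single split on one predictor: largest reduction in squared error.
best_split <- function(x, r) {
  u <- sort(unique(x))
  cuts <- (u[-length(u)] + u[-1]) / 2        # midpoints between distinct values
  total <- sum((r - mean(r))^2)
  gains <- vapply(cuts, function(cut) {
    left <- r[x <= cut]
    right <- r[x > cut]
    total - sum((left - mean(left))^2) - sum((right - mean(right))^2)
  }, numeric(1))
  i <- which.max(gains)
  list(gain = gains[i], cut = cuts[i])
}

# Boosted stumps: every predictor is tried in order at every iteration.
# The strict ">" means a later variable with an equal gain never replaces
# an earlier one, so the first variable wins ties.
stump_boost_importance <- function(X, y, n_trees = 50, shrinkage = 0.1) {
  imp <- setNames(numeric(ncol(X)), colnames(X))
  f <- rep(mean(y), length(y))
  for (m in seq_len(n_trees)) {
    r <- y - f                               # current residuals
    best_gain <- -Inf
    for (j in seq_len(ncol(X))) {
      s <- best_split(X[, j], r)
      if (s$gain > best_gain) {
        best_gain <- s$gain
        best_cut <- s$cut
        best_j <- j
      }
    }
    left <- X[, best_j] <= best_cut
    f[left] <- f[left] + shrinkage * mean(r[left])
    f[!left] <- f[!left] + shrinkage * mean(r[!left])
    imp[best_j] <- imp[best_j] + best_gain   # credit the chosen variable
  }
  100 * imp / sum(imp)
}

# RF-like stumps: bootstrap the rows and draw one candidate variable at
# random for each tree (assumed mtry = 1), so no variable is favoured.
rf_like_importance <- function(X, y, n_trees = 300) {
  imp <- setNames(numeric(ncol(X)), colnames(X))
  for (m in seq_len(n_trees)) {
    rows <- sample.int(nrow(X), replace = TRUE)
    j <- sample.int(ncol(X), 1)
    imp[j] <- imp[j] + best_split(X[rows, j], y[rows])$gain
  }
  100 * imp / sum(imp)
}

print(round(stump_boost_importance(X, y), 1))  # all importance lands on x1
print(round(rf_like_importance(X, y), 1))      # spread across x1..x3
```

In the boosted version the identical columns produce identical gains, so
x1 collects everything. In the RF-like version each column is equally
likely to be the candidate, so the shares should come out near 1/P each,
with some sampling noise.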

-- 

Max


