[R] Gradient Boosting Trees with correlated predictors in gbm
Max Kuhn
mxkuhn at gmail.com
Tue Mar 2 21:10:37 CET 2010
On Tue, Mar 2, 2010 at 2:43 PM, Liaw, Andy <andy_liaw at merck.com> wrote:
> In most implementations of boosting, and for that matter, single tree,
> the first variable wins when there are ties.
They must be in a union :-)
>> What happens if there's a third?
If there were P perfectly correlated predictors, the importance
would be 100% for the first one encountered by gbm. In reality, where
the correlation is strong but not perfect, the other variables would
show up with small importances. In the case of RF, the "dilution
factor" is 1/P for perfect correlations and gets fuzzier as the
correlation decreases (for reasons that Andy articulated).
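
To make the two cases concrete, here is a toy sketch in plain Python. It is
not what gbm or randomForest actually run: the function names, the use of
split counts as a stand-in for importance, and the mtry = 1 setting for the
RF-like case are all assumptions made for illustration. It only shows how
"first variable wins a tie" puts everything on one predictor, while randomly
sampling the candidate variable spreads the credit at roughly 1/P each.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_gain(x, y):
    """Largest reduction in squared error from one threshold split on x."""
    total = sse(y)
    best = 0.0
    for t in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        best = max(best, total - sse(left) - sse(right))
    return best

def pick_feature(gains, candidates):
    """Return the candidate with the largest gain; the first one seen wins ties."""
    best_j, best = None, -1.0
    for j in candidates:
        if gains[j] > best:  # strict '>' keeps the earlier variable on a tie
            best_j, best = j, gains[j]
    return best_j

def split_shares(P=3, n_splits=3000, mtry=None, seed=1):
    """Share of splits won by each of P identical predictors (toy importance).

    mtry=None: every predictor is a candidate at every split (boosting-like).
    mtry=k:    k predictors are sampled at random per split (RF-like, assumed).
    """
    rng = random.Random(seed)
    x = [rng.random() for _ in range(40)]
    y = [2.0 * xi + rng.gauss(0, 0.1) for xi in x]
    columns = [list(x) for _ in range(P)]  # perfectly correlated: exact copies
    gains = [best_gain(col, y) for col in columns]  # identical by construction
    counts = [0] * P
    for _ in range(n_splits):
        if mtry is None:
            candidates = range(P)
        else:
            candidates = sorted(rng.sample(range(P), mtry))
        counts[pick_feature(gains, candidates)] += 1
    return [c / n_splits for c in counts]

boost_like = split_shares(P=3)        # all candidates, ties go to the first
rf_like = split_shares(P=3, mtry=1)   # one random candidate per split
print("boosting-like shares:", boost_like)
print("RF-like shares:      ", rf_like)
```

With exact copies the gains tie exactly, so the boosting-like run gives the
first column every split, and the RF-like shares should land near 1/P. With
strong but imperfect correlation the ties go away, which is where the small
importances for the other variables come from.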
--
Max