[R] Random Forests: Predictor importance for Regression Trees
Liaw, Andy
andy_liaw at merck.com
Tue Apr 21 14:08:46 CEST 2009
Yes, you've got it!
Cheers,
Andy
From: Behalf Of Dimitri
>
> Hello!
>
> I think I am relatively clear on how predictor importance (the first
> one) is calculated by Random Forests for a Classification tree:
>
> Importance of predictor P1 when the response variable is categorical:
>
> 1. For out-of-bag (oob) cases, randomly permute their values on
> predictor P1 and then put them down the tree
> 2. For a given tree, subtract the number of votes for the correct
> class in the predictor-P1-permuted oob dataset from the number of
> votes for the correct class in the untouched oob dataset: if P1 is
> important, this number will be large.
> 3. The average of this number over all trees in the forest is the raw
> importance score for predictor P1.
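
For illustration only, the three classification steps above can be sketched in base R. This is not the randomForest package's internal code. The names raw_importance_class, trees and oob are hypothetical, and each "tree" is stood in for by a simple one-split function so that the example runs without any package.

```r
# Sketch under stated assumptions: each element of `trees` is a function that
# maps a data frame to predicted class labels, and oob[[t]] holds the row
# indices that tree t did not see (its out-of-bag cases).
raw_importance_class <- function(trees, oob, x, y, var) {
  delta <- vapply(seq_along(trees), function(t) {
    rows  <- oob[[t]]
    x_oob <- x[rows, , drop = FALSE]
    y_oob <- y[rows]
    # Votes for the correct class on the untouched oob cases
    votes_untouched <- sum(trees[[t]](x_oob) == y_oob)
    # Step 1: permute the predictor among the oob cases only
    x_perm <- x_oob
    x_perm[[var]] <- sample(x_perm[[var]])
    votes_permuted <- sum(trees[[t]](x_perm) == y_oob)
    # Step 2: untouched minus permuted
    as.numeric(votes_untouched - votes_permuted)
  }, numeric(1))
  # Step 3: average over all trees
  mean(delta)
}

# Toy data: the class depends on P1 only; P2 is noise.
set.seed(1)
n <- 200
x <- data.frame(P1 = rnorm(n), P2 = rnorm(n))
y <- ifelse(x$P1 > 0, "a", "b")

# Hypothetical stand-ins for fitted trees: one split on P1 per bootstrap sample.
ntree <- 25
inbag <- lapply(seq_len(ntree), function(t) sample(n, replace = TRUE))
oob   <- lapply(inbag, function(idx) setdiff(seq_len(n), idx))
trees <- lapply(inbag, function(idx) {
  threshold <- mean(x$P1[idx])
  function(newdata) ifelse(newdata$P1 > threshold, "a", "b")
})

imp_P1 <- raw_importance_class(trees, oob, x, y, "P1")
imp_P2 <- raw_importance_class(trees, oob, x, y, "P2")
```

Because the stand-in trees never look at P2, permuting P2 changes no votes and its raw score is zero, while the score for P1 is clearly positive.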
>
> I am wondering what step 2 above looks like if the response variable
> is continuous and not categorical, in other words - for a Regression
> tree. Could you please correct me if what I wrote below is wrong? Thank
> you very much!
>
> Importance of predictor P1 when the response variable is continuous:
>
> 1. For out-of-bag (oob) cases, randomly permute their values on
> predictor P1 and then put them down the tree
> 2. For a given tree, calculate mean squared deviation of observed y
> minus predicted y for (a) the untouched oob dataset and for (b) the
> predictor-P1-permuted oob dataset. Subtract (a) from (b).
> 3. The average of this number over all trees in the forest is the raw
> importance score for predictor P1.
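
As a rough sketch, the regression steps above could be written in base R as follows. This is an illustration, not the randomForest package's internal code. The names raw_importance_reg, trees and oob are hypothetical, and lm fits on bootstrap samples are used purely as stand-ins for fitted regression trees so that the example runs without any package.

```r
# Sketch under stated assumptions: each element of `trees` is a function that
# maps a data frame to numeric predictions, and oob[[t]] holds the row indices
# that tree t did not see (its out-of-bag cases).
raw_importance_reg <- function(trees, oob, x, y, var) {
  delta <- vapply(seq_along(trees), function(t) {
    rows  <- oob[[t]]
    x_oob <- x[rows, , drop = FALSE]
    y_oob <- y[rows]
    # (a) mean squared error on the untouched oob cases
    mse_a <- mean((y_oob - trees[[t]](x_oob))^2)
    # Step 1: permute the predictor among the oob cases only
    x_perm <- x_oob
    x_perm[[var]] <- sample(x_perm[[var]])
    # (b) mean squared error on the permuted oob cases
    mse_b <- mean((y_oob - trees[[t]](x_perm))^2)
    # Step 2: subtract (a) from (b)
    mse_b - mse_a
  }, numeric(1))
  # Step 3: average over all trees
  mean(delta)
}

# Toy data: y depends on P1 only; P2 is noise.
set.seed(1)
n <- 200
x <- data.frame(P1 = rnorm(n), P2 = rnorm(n))
y <- 3 * x$P1 + rnorm(n, sd = 0.1)

# Hypothetical stand-ins for fitted trees: one lm per bootstrap sample.
ntree <- 25
inbag <- lapply(seq_len(ntree), function(t) sample(n, replace = TRUE))
oob   <- lapply(inbag, function(idx) setdiff(seq_len(n), idx))
trees <- lapply(inbag, function(idx) {
  fit <- lm(y ~ P1 + P2, data = cbind(x, y = y)[idx, ])
  function(newdata) predict(fit, newdata = newdata)
})

imp_P1 <- raw_importance_reg(trees, oob, x, y, "P1")
imp_P2 <- raw_importance_reg(trees, oob, x, y, "P2")
```

With these toy data the raw score for P1 is large and the score for P2 is close to zero. With the randomForest package itself, the analogous unscaled measure can be requested by fitting with importance = TRUE and calling importance(fit, type = 1, scale = FALSE).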
>
> --
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com
>