[R] Random Forests: Predictor importance for Regression Trees

Liaw, Andy andy_liaw at merck.com
Tue Apr 21 14:08:46 CEST 2009


Yes, you've got it!

Cheers,
Andy 

From: Dimitri Liakhovitski
> 
> Hello!
> 
> I think I am relatively clear on how predictor importance (the first
> one) is calculated by Random Forests for a Classification tree:
> 
> Importance of predictor P1 when the response variable is categorical:
> 
> 1. For out-of-bag (oob) cases, randomly permute their values on
> predictor P1 and then put them down the tree
> 2. For a given tree, subtract the number of votes for the correct
> class in the predictor-P1-permuted oob dataset from the number of
> votes for the correct class in the untouched oob dataset: if P1 is
> important, this number will be large.
> 3. The average of this number over all trees in the forest is the raw
> importance score for predictor P1.
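
The three classification steps above can be sketched in code. This is a minimal Python illustration of the steps as written in this post, not the randomForest package's internals. The `trees` structure (a list of `(predict, oob_idx)` pairs), the toy data, and every function name are hypothetical.

```python
import random


def raw_importance_classification(trees, X, y, p, rng):
    """Average, over trees, of (correct OOB votes) - (correct OOB votes after permuting predictor p).

    trees: list of (predict, oob_idx) pairs (hypothetical structure);
           predict maps one feature row to a class label,
           oob_idx lists the row indices that are out-of-bag for that tree.
    """
    diffs = []
    for predict, oob_idx in trees:
        # Votes for the correct class on the untouched OOB cases.
        correct = sum(int(predict(X[i]) == y[i]) for i in oob_idx)

        # Step 1: permute predictor p among this tree's OOB cases only.
        values = [X[i][p] for i in oob_idx]
        rng.shuffle(values)

        # Votes for the correct class on the permuted OOB cases.
        correct_permuted = 0
        for i, v in zip(oob_idx, values):
            row = list(X[i])
            row[p] = v
            correct_permuted += int(predict(row) == y[i])

        # Step 2: untouched minus permuted.
        diffs.append(correct - correct_permuted)

    # Step 3: average over all trees.
    return sum(diffs) / len(diffs)


def stump(row):
    # Toy "tree" that splits only on predictor 0 (assumption for the demo).
    return int(row[0] > 0.5)


rng = random.Random(0)
X = [[i / 20.0, rng.random()] for i in range(20)]  # predictor 1 is pure noise
y = [int(row[0] > 0.5) for row in X]
trees = [(stump, list(range(0, 20, 2))), (stump, list(range(1, 20, 2)))]

imp_p0 = raw_importance_classification(trees, X, y, 0, rng)
imp_p1 = raw_importance_classification(trees, X, y, 1, rng)
```

Because the toy tree never looks at predictor 1, permuting it cannot change any vote, so `imp_p1` is exactly zero, while `imp_p0` cannot be negative for a tree that classifies the untouched cases perfectly.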
> 
> I am wondering what step 2 above looks like if the response variable
> is continuous rather than categorical, in other words, for a Regression
> tree. Could you please correct me if what I wrote below is wrong? Thank
> you very much!
> 
> Importance of predictor P1 when the response variable is continuous:
> 
> 1. For out-of-bag (oob) cases, randomly permute their values on
> predictor P1 and then put them down the tree
> 2. For a given tree, calculate the mean squared error (the mean of the
> squared differences between observed y and predicted y) for (a) the
> untouched oob dataset and for (b) the predictor-P1-permuted oob
> dataset. Subtract (a) from (b).
> 3. The average of this number over all trees in the forest is the raw
> importance score for predictor P1.
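
The regression steps above can be sketched the same way. Again, this is a minimal Python illustration of the procedure as described in this post, under the assumption that each tree is available as a hypothetical `(predict, oob_idx)` pair. It is not the randomForest package's code, and all names and data are made up for the example.

```python
import random


def mse(predict, rows, targets):
    # Mean of squared (observed y - predicted y).
    return sum((t - predict(r)) ** 2 for r, t in zip(rows, targets)) / len(rows)


def raw_importance_regression(trees, X, y, p, rng):
    """Average, over trees, of MSE(permuted OOB) - MSE(untouched OOB) for predictor p."""
    diffs = []
    for predict, oob_idx in trees:
        rows = [list(X[i]) for i in oob_idx]
        targets = [y[i] for i in oob_idx]

        # (a) MSE on the untouched OOB cases.
        mse_untouched = mse(predict, rows, targets)

        # Step 1: permute predictor p among this tree's OOB cases only.
        values = [r[p] for r in rows]
        rng.shuffle(values)
        permuted = [r[:p] + [v] + r[p + 1:] for r, v in zip(rows, values)]

        # (b) MSE on the permuted OOB cases.
        mse_permuted = mse(predict, permuted, targets)

        # Step 2: subtract (a) from (b).
        diffs.append(mse_permuted - mse_untouched)

    # Step 3: average over all trees.
    return sum(diffs) / len(diffs)


def toy_tree(row):
    # Toy "tree" that predicts from predictor 0 only (assumption for the demo).
    return 2.0 * row[0]


rng = random.Random(0)
X = [[float(i), rng.random()] for i in range(20)]  # predictor 1 is pure noise
y = [2.0 * row[0] for row in X]
trees = [(toy_tree, list(range(0, 20, 2))), (toy_tree, list(range(1, 20, 2)))]

imp_p0 = raw_importance_regression(trees, X, y, 0, rng)
imp_p1 = raw_importance_regression(trees, X, y, 1, rng)
```

Here the toy tree fits the untouched data exactly, so (a) is zero and `imp_p0` is just the average permuted MSE, while `imp_p1` is zero because the tree ignores predictor 1. In R, the corresponding measure is reported by the randomForest package when the forest is fitted with `importance = TRUE` and queried with `importance(fit, type = 1)`.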
> 
> -- 
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com
> 