[R] randomForest: predictor importance (for regressions)

Liaw, Andy andy_liaw at merck.com
Thu May 6 16:31:55 CEST 2010

> From: Dimitri Liakhovitski
> Thank you very much, Andy.
> I did turn off HTML - hope it'll solve the problem!

Indeed it does!
> > [AL]: As I said, you are recommended to use importance() to extract
> > variable importance.  The recommendation is for avoiding confusions
> > like yours.  If you want to know what the components in the objects
> > give you, compared to what the extractor function returns, you can
> > look inside the extractor function to find out for yourself.
> > Really, I'm not trying to be difficult, but there are very good
> > reasons for not accessing the components directly when extractor
> > functions exist.  If the underlying components are somehow changed
> > in the future, only the extractor functions are guaranteed to give
> > you the "right thing".  I added the extractor function for
> > importance measures precisely because the way they are computed
> > changed.
> Andy, I'll explain why I am asking. I probably should have done it in
> the beginning:
> I am asking not in order to figure out how to do it. I am asking in
> order to figure out something that was done around November 01, 2008.
> Back then, a piece of code was run where, from the object of
> randomForest(.... importance=T...), the importances were extracted
> (just by referring to $importance) and the first column was used.
> Do you happen to know what they were back then? Standardized or not?

The change coincided with the introduction of the importanceSD component, and both came from the change in how the importance is measured.  The "importance" component is just mean(d[i]), and the "importanceSD" component is sd(d[i])/sqrt(ntree), both taken over the ntree trees.  The importance() function by default (scale=TRUE) does the normalization, and that's what you should use.  Leo found that this normalization greatly reduces the "bias" due to differing numbers of possible splits in different predictors.
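
To make the arithmetic concrete, here is a rough sketch in base R.  The d values are made up for illustration (they are not output from randomForest), and treating the normalization as "mean divided by its standard error" is my reading of the description above, so take it as an assumption rather than the package source:

```r
# Hypothetical per-tree importance values d[i] for ONE predictor.
# These numbers are invented; in a real forest there is one d[i] per tree.
d <- c(2.0, 3.5, 1.5, 4.0, 3.0, 2.5, 3.5, 4.0)
ntree <- length(d)

raw_imp <- mean(d)               # analogue of the "importance" component
imp_sd  <- sd(d) / sqrt(ntree)   # analogue of the "importanceSD" component

# Assumed form of the normalization done when scale=TRUE:
# the raw mean divided by its standard error.
scaled <- raw_imp / imp_sd

print(c(raw = raw_imp, se = imp_sd, scaled = scaled))
```

On a fitted object you would not do this by hand: importance(fit, scale=FALSE) should give the raw values and importance(fit) the normalized ones, where fit is whatever name you gave the randomForest object.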


> Thank you!
> Dimitri