[R] How do I extract Random Forest Terms and Probabilities?

Thu Dec 5 18:01:00 CET 2013

Gents:

This discussion is now off-topic here, I believe. Please take it
private or move to somewhere else more appropriate (SO, maybe).

Cheers,
Bert

On Thu, Dec 5, 2013 at 8:32 AM, Lopez, Dan <lopez235 at llnl.gov> wrote:
> Hi Andy,
>
> I have used predict before, and in fact when I apply it to the training set I get a perfect model (i.e. a right-angle ROC curve in the upper left quadrant), which just looks like it overfit the data.
> This is not the case with the test set, where I get an AUC of 0.77.
>
> I wanted to attempt a couple of calibration techniques I learned from Max Kuhn's Applied Predictive Modeling book. He uses a training set to do this, but with the perfect training-set predictions I have now there is nothing to calibrate.
>
> That's why I thought I would use the original probabilities from the randomForest model that was used to create fm$predicted (fm is my randomForest model).
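
The gap Dan describes can be reproduced in a small sketch. Calling predict() with newdata set to the training data scores every row with all trees, most of which were grown on that row, so the result looks near-perfect. Calling predict() without newdata returns out-of-bag (OOB) values, which are the ones behind fm$predicted. The iris data stands in for Dan's data here, and the helper to_class and the seed are illustrative assumptions.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(42)  # arbitrary seed, only for reproducibility
fm <- randomForest(Species ~ ., data = iris, ntree = 500)

# Resubstitution: each row is scored by all trees, most of which were grown on it.
train_prob <- predict(fm, newdata = iris, type = "prob")

# Out-of-bag: each row is scored only by trees that never saw it.
oob_prob <- predict(fm, type = "prob")

# Turn class probabilities into labels and compare accuracy (illustrative helper).
to_class <- function(p) colnames(p)[max.col(unclass(p), ties.method = "first")]
acc_train <- mean(to_class(train_prob) == iris$Species)
acc_oob   <- mean(to_class(oob_prob) == iris$Species)
c(train = acc_train, oob = acc_oob)
```

Under these assumptions, oob_prob rather than train_prob would be the more honest input to a calibration step fitted on the training set.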
>
> I am still fairly new at predictive modeling and it could be the case that maybe I am not understanding something basic here.
>
> Thanks.
> Dan
>
> -----Original Message-----
> From: Liaw, Andy [mailto:andy_liaw at merck.com]
> Sent: Monday, December 02, 2013 8:40 AM
> To: arun; R help; Lopez, Dan
> Subject: RE: [R] How do I extract Random Forest Terms and Probabilities?
>
> #2 can be done simply with predict(fmi, type="prob").  See the help page for predict.randomForest().
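
A minimal sketch of that call, assuming the randomForest package is installed and using the iris model from the original question (the seed is arbitrary). Without a newdata argument, predict() returns the out-of-bag class fractions for the training rows, and the same values should also be available in the fitted object as fmi$votes.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(42)  # arbitrary seed, only for reproducibility
fmi <- randomForest(Species ~ ., data = iris, mtry = 3, ntree = 500)

# Without newdata, predict() returns out-of-bag (OOB) results: each row's
# class fractions come only from trees that did not see that row.
oob_prob <- predict(fmi, type = "prob")
head(oob_prob)

# The fitted object keeps the same vote fractions in fmi$votes
# (fractions rather than counts, since norm.votes = TRUE is the default).
all.equal(unclass(oob_prob), unclass(fmi$votes), check.attributes = FALSE)
```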
>
> Best,
> Andy
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of arun
> Sent: Tuesday, November 26, 2013 6:57 PM
> To: R help
> Subject: Re: [R] How do I extract Random Forest Terms and Probabilities?
>
>
>
> Hi,
> For the first part, you could do:
>
> fmi2 <- fmi
> attributes(fmi2$terms) <- NULL
> capture.output(fmi2$terms)
> #[1] "Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width"
>
> A.k.
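
A possibly simpler route to the same string, sketched with base R only. It assumes fmi$terms is an ordinary terms object, which the output above suggests. The object tt is a stand-in built from the same formula and data, and the column names ModelID, Type and Formula are illustrative.

```r
# Stand-in for fmi$terms: a terms object built from the same formula and data.
# (Assumption: randomForest stores an ordinary terms object in fmi$terms.)
tt <- terms(Species ~ ., data = iris)

# formula() drops the extra terms attributes; a wide width.cutoff keeps
# deparse() from splitting a long formula across several strings.
formula_string <- paste(deparse(formula(tt), width.cutoff = 500L), collapse = " ")

# Store it alongside other model metadata (column names are illustrative).
models <- data.frame(ModelID = "001", Type = "RF",
                     Formula = formula_string, stringsAsFactors = FALSE)
print(models)
```

With a fitted model, replacing tt by fmi$terms should give the same string without copying or modifying the model object.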
>
> On Tuesday, November 26, 2013 3:55 PM, "Lopez, Dan" <lopez235 at llnl.gov> wrote:
> Hi R Experts,
>
> I need your help with two questions regarding randomForest.
>
>
> 1. When I run a Random Forest model, how do I extract the formula I used so that I can store it in a character vector in a data frame?
> For example, the data frame might look like this if I am running models on the iris dataset:
>
> #ModelID,Type,
> #001,RF,Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width
>
> library(randomForest)
> fmi <- randomForest(Species ~ ., data = iris, mtry = 3, ntree = 500)
> # I know the information is in fmi$terms, but I am not sure how to extract just the formula. Or is there somewhere else in fmi where I could get this?
>
>
> 2. How do I get the probabilities (probability-like values) from the model that was run? I know that for the test set I can use predict, and that fmi$predicted holds the classifications from the model. But where are the probabilities?
>
>
> Dan
> Workforce Analyst
> HRIM - Workforce Analytics & Metrics
> LLNL
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374