[R] use "caret" to rank predictors by random forest model
mxkuhn
mxkuhn at gmail.com
Mon Mar 14 18:45:35 CET 2011
Xiaoqi,
You need to specify the sizes. There are other search algorithms that automatically pick the size (such as genetic algorithms), but I don't have those in the package yet.
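As a rough sketch of what specifying the sizes looks like with rfe(): the data below are simulated purely for illustration, and the candidate sizes, the 5-fold cross-validation and the object names (ctrl, rfe.fit) are assumptions, not anything from the original question.

```r
library(caret)
library(randomForest)

set.seed(1)
# Simulated data for illustration only: 100 samples, 10 predictors,
# of which only the first two drive the binary response.
x <- as.data.frame(matrix(rnorm(100 * 10), ncol = 10))
y <- factor(ifelse(x$V1 + x$V2 + rnorm(100, sd = 0.5) > 0, "yes", "no"))

# The candidate subset sizes are supplied by the user; rfe() resamples
# each of them (plus the full predictor set) and keeps the best one.
ctrl <- rfeControl(functions = rfFuncs, method = "cv", number = 5)
rfe.fit <- rfe(x, y, sizes = c(2, 4, 6, 8), rfeControl = ctrl)

rfe.fit$optsize       # best-performing size among those evaluated
predictors(rfe.fit)   # names of the predictors kept at that size
```

The size reported is only the best of the candidates that were tried, so a fairly dense grid of sizes is the closest substitute for an automatic search.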
Another approach is to use univariate filtering (see the sbf function in caret).
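A minimal sketch of the filtering route, again on simulated data; the rfSBF helper set, the 5-fold cross-validation and the object names (sbf.ctrl, sbf.fit) are choices made for this example only.

```r
library(caret)
library(randomForest)

set.seed(2)
# Simulated data for illustration only.
x <- as.data.frame(matrix(rnorm(100 * 10), ncol = 10))
y <- factor(ifelse(x$V1 + x$V2 + rnorm(100, sd = 0.5) > 0, "yes", "no"))

# rfSBF scores each predictor on its own, keeps those that pass the
# filter, and fits a random forest on the survivors. No subset sizes
# are supplied; the filter decides how many predictors are kept.
sbf.ctrl <- sbfControl(functions = rfSBF, method = "cv", number = 5)
sbf.fit <- sbf(x, y, sbfControl = sbf.ctrl)

sbf.fit               # resampled performance and filter summary
predictors(sbf.fit)   # predictors that passed the filter on the full data
```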
Max
On Mar 13, 2011, at 8:49 PM, Xiaoqi Cui <xcui at mtu.edu> wrote:
> Thanks for your prompt reply!
>
> You're right, I didn't add the parameter "importance=TRUE" when I used function "train" to fit the random forest model. Once I used the above parameter, everything went well. Also the functions "varImp" and "plot" work well too.
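
For the archive, a sketch of the sequence that works once importance = TRUE is passed. The data are simulated for illustration, and the resampling settings and object names (rf.train, rf.imp) are assumptions rather than the code that was actually run.

```r
library(caret)
library(randomForest)

set.seed(3)
# Simulated data for illustration only.
x <- as.data.frame(matrix(rnorm(100 * 10), ncol = 10))
y <- factor(ifelse(x$V1 + x$V2 + rnorm(100, sd = 0.5) > 0, "yes", "no"))

# Extra arguments such as ntree and importance are passed through
# train() to randomForest().
rf.train <- train(x, y, method = "rf", ntree = 500, importance = TRUE,
                  trControl = trainControl(method = "cv", number = 5))

rf.imp <- varImp(rf.train)   # object of class "varImp.train"
rf.imp                       # importance scores for each predictor
plot(rf.imp, top = 10)       # dot plot of the ten highest-ranked predictors
```
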
>
> I noticed "caret" is really good at selecting important predictors. I have another question about using the package "caret" to select the best subset of predictors. As far as I know, the function "rfe" can be used to select the optimal set of important predictors given a series of candidate subset sizes. I'm wondering if "caret" can automatically give the best size of the selected subset without the user providing the candidate sizes. Thanks,
>
> Best,
>
> Xiaoqi
> ----- Original Message -----
> From: "Max Kuhn" <mxkuhn at gmail.com>
> To: "Xiaoqi Cui" <xcui at mtu.edu>
> Cc: r-help at r-project.org
> Sent: Monday, March 7, 2011 2:33:06 PM GMT -06:00 US/Canada Central
> Subject: Re: [R] use "caret" to rank predictors by random forest model
>
> It would help if you provided the code that you used for the caret functions.
>
> The most likely issue is not using importance = TRUE in the call to train().
>
> I believe that I've only implemented code for plotting the varImp
> objects resulting from train() (eg. there is plot.varImp.train but not
> plot.varImp).
>
> Max
>
> On Mon, Mar 7, 2011 at 3:27 PM, Xiaoqi Cui <xcui at mtu.edu> wrote:
>> Hi,
>>
>> I'm using the package "caret" to rank predictors with a random forest model and draw a predictor importance plot. I used the commands below:
>>
>> rf.fit <- randomForest(x, y, ntree = 500, importance = TRUE)
>> ## "x" is a matrix whose columns are predictors, "y" is a binary response vector
>> ## Then I got the ranked predictors by ranking rf.fit$importance[, "MeanDecreaseAccuracy"]
>> ## Then draw the importance plot
>> varImpPlot(rf.fit)
>>
>> As you can see, all the functions I used come directly from the package "randomForest" rather than from "caret", so I'm wondering if the package "caret" has functions that can do the above ranking and plotting.
>>
>> In fact, I tried the functions "train", "varImp" and "plot" from the package "caret", but the random forest model built by "train" could not be passed correctly to "varImp", which gave an error message like "subscripts out of bounds". The function "plot" doesn't work either.
>>
>> So I'm wondering if anybody has encountered the same problem before, and could shed some light on this. I would really appreciate your help.
>>
>> Thanks,
>> Xiaoqi
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
>
> Max