[R] Stepwise SVM Variable selection

Noah Silverman noah at smartmediacorp.com
Fri Jan 7 08:52:32 CET 2011


I'll give it a try,

Thanks!

-N


On 1/6/11 11:34 PM, Steve Lianoglou wrote:
> Hi,
>
> On Fri, Jan 7, 2011 at 2:10 AM, Noah Silverman <noah at smartmediacorp.com> wrote:
>> I have a data set with about 30,000 training cases and 103 variables.
>>
>> I've trained an SVM (using the e1071 package) for a binary classifier {0,1}.
>>   The accuracy isn't great.
>>
>> I used a grid search over the C (cost) and G (gamma) parameters with an RBF
>> kernel to find the best settings.
>>
>> I remember that for least squares, R has a nice stepwise function that will
>> try combining subsets of variables to find the optimal result.  Clearly,
>> this doesn't exist for SVMs as a built-in function.
>>
>> As an experiment, I simply grabbed the first 50 variables and repeated the
>> training/grid search procedure.  The results were significantly better.
>>   Since the data is VERY noisy, my guess is that dropping some of the
>> variables removed noise, which improved the results.
>>
>> With a grid of 100 parameter settings (10 for C, 10 for G) and 106
>> variables, trying every combination would be prohibitively time consuming.
>>
>> Can anyone suggest an approach to seek the ideal subset of variables for my
>> SVM classifier?
> Sounds like a job for the types of approaches found in the penalizedSVM package:
>
> http://cran.r-project.org/web/packages/penalizedSVM/index.html
>
> -steve
>
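
For reference, the stepwise idea raised in the question can be sketched
directly with e1071. This is only a rough illustration, not what the
penalizedSVM package does: it greedily adds one variable at a time and keeps
the one that most improves cross-validated accuracy. The helper names
(cv_accuracy, forward_select), the synthetic data, and the fixed cost/gamma
defaults are all assumptions made for the example; on 30,000 cases it would
likely need a subsample to run in reasonable time.

```r
library(e1071)

# Cross-validated accuracy (percent) of an RBF SVM restricted to columns `vars`.
# `y` must be a factor so that svm() performs classification.
cv_accuracy <- function(x, y, vars, cost = 1, gamma = 1 / length(vars), folds = 5) {
  fit <- svm(x[, vars, drop = FALSE], y, kernel = "radial",
             cost = cost, gamma = gamma, cross = folds)
  fit$tot.accuracy
}

# Greedy forward selection (hypothetical helper): at each step, add the
# variable that gives the highest CV accuracy; stop when nothing improves
# or when max_vars variables have been chosen.
forward_select <- function(x, y, max_vars = ncol(x), ...) {
  max_vars <- min(max_vars, ncol(x))
  selected <- integer(0)
  best <- -Inf
  while (length(selected) < max_vars) {
    candidates <- setdiff(seq_len(ncol(x)), selected)
    scores <- sapply(candidates,
                     function(j) cv_accuracy(x, y, c(selected, j), ...))
    if (max(scores) <= best) break
    best <- max(scores)
    selected <- c(selected, candidates[which.max(scores)])
  }
  list(vars = selected, accuracy = best)
}

# Synthetic example: only columns 1 and 2 carry signal, the rest are noise.
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 6), n, 6)
y <- factor(as.integer(x[, 1] + x[, 2] > 0))
res <- forward_select(x, y, max_vars = 3)
print(res)
```

Because the cross-validation folds are random, the selected set can vary
between runs. Re-tuning C and gamma for each candidate subset would be more
faithful to the original setup, but it multiplies the cost by the size of the
grid.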