[R] svm regression/classification

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Dec 30 08:27:26 CET 2009


Hi Nancy,

2009/12/30 Nancy Adam <nancyadam84 at hotmail.com>:
> Hi steve,
>
> Thank you so much for your reply.
>
> I’m asking about the difference between two cases:
>
> 1) when I use svm in a regression system and
>
> 2) when I use svm in a classification system.
>
>  Is the code of using svm in these two cases the same?

Whether the `svm` function performs classification or regression is
controlled by its `type` parameter. According to the help page (?svm),
the default is picked from the type of your response `y`: it does
classification if `y` is a factor, and regression otherwise (e.g. if
`y` is a numeric vector).

That said, you can set this parameter explicitly so that you're sure
of what the function is doing, e.g.:

## classification:
mymodel <- svm(myformula, data=mydata, type='C-classification')

## regression
mymodel <- svm(myformula, data=mydata, type='eps-regression')
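
As a minimal sketch of the default behaviour (assuming the e1071
package is installed, and using the built-in `iris` data purely for
illustration, since I don't have your data), the same kind of `svm`
call does classification or regression depending on the response:

```r
library(e1071)  # provides svm(); assumed to be installed

# Factor response (Species): svm() defaults to classification,
# so the predictions come back as a factor of class labels.
clf <- svm(Species ~ ., data = iris)
is.factor(predict(clf, iris))

# Numeric response (Sepal.Length): svm() defaults to regression,
# so the predictions come back as a numeric vector.
reg <- svm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
           data = iris)
is.numeric(predict(reg, iris))
```

Printing the fitted model (e.g. `print(clf)`) also reports which SVM
type was used, which is a quick way to check what you got.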

> This is the code for a regression system:
>
> my_svm_model <- function(myformula, mydata, mytestdata)
>
>             {
>
>       mymodel <- svm(myformula, data=mydata)
>
>             mytest <- predict(mymodel, mytestdata)
>
>             error <- mytest - mytestdata[,1]
>
>             -sqrt(mean(error**2))
>
>             }

That's not really "code for a regression system" -- as I said above,
whether you get regression or classification depends on the type of
the response `y` in your formula (unless you explicitly set
type='something'). If `y` is numeric you get regression; if it is a
factor you get classification.

It looks like your `my_svm_model` is a function that calculates (the
negative of) the root-mean-squared-error (why negative, btw?). This
performance calculation is appropriate for regression, but not for
classification. For classification you probably want to report the
accuracy of the predicted labels, e.g. something like:

## predict() for svm models has no `type` argument; the fitted model
## already knows its type. True labels are assumed to be in column 1.
mytest <- predict(mymodel, mytestdata)
accuracy <- mean(mytest == mytestdata[,1])

As I said in my earlier email, it's not really appropriate to run,
say, a regression and then report "accuracy" as it's defined for
classification.
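
To make the difference concrete, here is a rough sketch of a scoring
function that picks the metric from the type of the response. The name
`evaluate_svm` and the train/test split of `iris` are made up for
illustration; it assumes e1071 is installed and that the test data has
no missing values.

```r
library(e1071)  # provides svm(); assumed to be installed

# Hypothetical helper: fit on mydata, score on mytestdata with a
# metric that matches the response type.
evaluate_svm <- function(myformula, mydata, mytestdata) {
  mymodel <- svm(myformula, data = mydata)
  predicted <- predict(mymodel, mytestdata)
  # Take the true response from the formula, not from a column position
  observed <- model.response(model.frame(myformula, data = mytestdata))
  if (is.factor(observed)) {
    # classification: fraction of correctly predicted labels
    c(accuracy = mean(as.character(predicted) == as.character(observed)))
  } else {
    # regression: root-mean-squared error
    c(rmse = sqrt(mean((predicted - observed)^2)))
  }
}

set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
test <- iris[-idx, ]

evaluate_svm(Species ~ ., train, test)   # factor response: accuracy
evaluate_svm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
             train, test)                # numeric response: RMSE
```

Looking the response up through the formula avoids relying on it
being the first column of the test data.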

I'm not sure if I'm answering your question, partly because I'm not
sure what you're asking. Are you unsure whether you should be doing
classification or regression? Or do you know which of the two you want
but don't understand how to get `svm` to perform it?

Please clarify the above point if you still need more help.

Hope that helps,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



