[R] SVM cross validation in e1071

Steve Lianoglou mailinglist.honeypot at gmail.com
Wed Jul 8 03:43:49 CEST 2009


Hi Tao,

On Jul 7, 2009, at 8:33 PM, Tao Shi wrote:

> Hi list,
>
> Could someone help me to explain why the leave-one-out cross  
> validation results I got from svm using the internal option "cross"  
> are different from those I got manually?  It seems using "cross" to  
> do cross validation, the results are always better.  Please see the  
> code below.  I also include lda as a comparison.

Looking at the C code in Rsvm.c, it looks like the model that is  
returned is one that is trained on *all* of the data that is  
originally passed in.

After the model is built, and the value for cross is > 1, the  
`do_cross_validation` function is called, in which your data is then  
split into folds for cross validation. This is only done to report  
accuracy or MSE (depending on classification vs. regression). The  
models from this CV do not affect the model that is returned back to R.

So ... that's why. If you train your svm without holding out any data  
(and do no cross validation), you should essentially get back the same  
model that you're getting back now when you set cross > 1.
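
Here is a rough sketch of how you could check this yourself. It uses the
built-in iris data rather than your data, and the object names (fit_cv,
fit_plain, loo_pred) are just made up for the example, so treat it as an
illustration under those assumptions, not as your exact setup.

```r
library(e1071)

data(iris)
n <- nrow(iris)

# Fit with the internal cross validation switched on (cross = n is leave-one-out).
fit_cv <- svm(Species ~ ., data = iris, cross = n)

# Fit on exactly the same data with no cross validation at all.
fit_plain <- svm(Species ~ ., data = iris)

# If the returned model is trained on all of the data, both fits should
# give the same fitted values and the same number of support vectors.
same_fitted <- all(fitted(fit_cv) == fitted(fit_plain))
same_nsv    <- fit_cv$tot.nSV == fit_plain$tot.nSV

# The CV estimate is stored next to the model; it is not used to build it.
cv_acc <- fit_cv$tot.accuracy

# Predicting the training data with the returned model is NOT cross
# validation: every sample was seen during training.
resub_acc <- mean(predict(fit_cv, iris) == iris$Species) * 100

# Manual leave-one-out: refit without sample i, then predict sample i.
loo_pred <- sapply(seq_len(n), function(i) {
  fit_i <- svm(Species ~ ., data = iris[-i, ])
  as.character(predict(fit_i, iris[i, , drop = FALSE]))
})
loo_acc <- mean(loo_pred == iris$Species) * 100

print(c(cv = cv_acc, resubstitution = resub_acc, manual_loo = loo_acc))
```

The number to compare against your manual leave-one-out result is
tot.accuracy. Calling predict() on the returned model with the training
data will usually look better, since that model has already seen every
sample.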

Does that make sense?

-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

Contact Info: http://cbio.mskcc.org/~lianos



