[R] SVM cross validation in e1071
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Jul 8 03:43:49 CEST 2009
Hi Tao,
On Jul 7, 2009, at 8:33 PM, Tao Shi wrote:
> Hi list,
>
> Could someone help me to explain why the leave-one-out cross
> validation results I got from svm using the internal option "cross"
> are different from those I got manually? It seems using "cross" to
> do cross validation, the results are always better. Please see the
> code below. I also include lda as a comparison.
Judging from the C code in Rsvm.c, the model that is returned is
one that is trained on *all* of the data that is originally
passed in.
After the model is built, if the value for cross is > 1, the
`do_cross_validation` function is called, in which your data is then
split into folds for cross validation. This is only done to report
accuracy or MSE (depending on classification vs. regression). The
models from this CV do not affect the model that is returned back to R.
So ... that's why. If you train your svm without holding out any data
(and do no cross validation), you should essentially get back the same
model that you're getting back now when you set cross > 1.
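Here is a rough way to check this from R. It is only a sketch and not the code from your post: the iris data and the names fit_cv, fit_plain and loo_acc are stand-ins I made up, and I am assuming the default settings of svm() everywhere. It fits once with cross set and once without, compares the two returned models, and then runs a manual leave-one-out loop in which each held-out row really is excluded from training.

```r
library(e1071)

data(iris)
x <- as.matrix(iris[, 1:4])   # numeric predictors
y <- iris$Species             # class labels (factor)
n <- nrow(x)

# Fit with the internal leave-one-out option: cross = number of rows.
fit_cv <- svm(x, y, cross = n)

# Fit with no cross validation at all.
fit_plain <- svm(x, y)

# If the returned model is trained on all of the data in both cases,
# the support vectors and their coefficients should agree.
same_index <- identical(fit_cv$index, fit_plain$index)
same_coefs <- isTRUE(all.equal(fit_cv$coefs, fit_plain$coefs))
print(c(same_index = same_index, same_coefs = same_coefs))

# The CV results are stored next to the model, not in it.
print(fit_cv$tot.accuracy)    # overall CV accuracy, in percent

# Manual leave-one-out: one separate model per held-out row.
hits <- logical(n)
for (i in seq_len(n)) {
  fit_i <- svm(x[-i, , drop = FALSE], y[-i])
  pred_i <- predict(fit_i, x[i, , drop = FALSE])
  hits[i] <- as.character(pred_i) == as.character(y[i])
}
loo_acc <- 100 * mean(hits)
print(loo_acc)
```

Note that predicting with fit_cv on the same rows it was trained on gives a training accuracy, not a cross-validated one, so only tot.accuracy and loo_acc are comparable here.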
Does that make sense?
-steve
--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos