[R] Cross-validation accuracy in SVM

Liaw, Andy andy_liaw at merck.com
Thu Jan 20 20:59:30 CET 2005


The 99.7% accuracy you quoted, I take it, is the accuracy on the training
set: the contingency table comes from predicting the same data the model was
fitted to.  If so, that number hardly means anything (other than, perhaps,
self-fulfilling prophecy).  Usually what one wants is for the model to
predict, with high accuracy, data that weren't used to train it.  That's what
cross-validation tries to emulate: in 5-fold CV the training data are split
into five parts, each part is predicted by a model fitted to the other four,
and the reported accuracies come from those held-out predictions.  It gives
you an estimate of how well you can expect your model to do on data that the
model has not seen.
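
To make the two numbers concrete, here is a minimal sketch along the lines of
your call, using e1071's svm().  The toy data, and the names x and y, are just
placeholders for your predictors and class labels; cost and gamma are the
values from your output:

  library(e1071)

  ## toy data standing in for your predictors (x) and class labels (y)
  set.seed(1)
  x <- matrix(rnorm(200 * 5), ncol = 5)
  y <- factor(rnorm(200) + x[, 1] > 0, labels = c("false", "true"))

  ## fit with 5-fold cross-validation, as in your call
  fit <- svm(x, y, cost = 8, gamma = 0.007, cross = 5)

  ## the "99.7%"-style number: the model predicting its own training data
  tab <- table(origclasses = y, predclasses = predict(fit, x))
  sum(diag(tab)) / sum(tab)

  ## the "92.2%"-style number: average accuracy on the held-out folds
  fit$tot.accuracy    # per-fold values are in fit$accuracies

The resubstitution figure will usually look much better than the
cross-validated one; the latter is what the printout reports as "Total
Accuracy", and it is the one worth optimizing when tuning cost and gamma.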

Andy

> From: Ton van Daelen
> 
> Hi all -
> 
> I am trying to tune an SVM model by optimizing the cross-validation
> accuracy. Maximizing this value doesn't necessarily seem to minimize
> the number of misclassifications. Can anyone tell me how the
> cross-validation accuracy is defined? In the output below, for example,
> cross-validation accuracy is 92.2%, while the number of correctly
> classified samples is (1476+170)/(1476+170+4) = 99.7% !?
> 
> Thanks for any help.
> 
> Regards - Ton
> 
> ---
> Parameters:
>    SVM-Type:  C-classification 
>  SVM-Kernel:  radial 
>        cost:  8 
>       gamma:  0.007 
> 
> Number of Support Vectors:  1015
> 
>  ( 148 867 )
> 
> Number of Classes:  2 
> 
> Levels: 
>  false true
> 
> 5-fold cross-validation on training data:
> 
> Total Accuracy: 92.24242 
> Single Accuracies:
>  90 93.33333 94.84848 92.72727 90.30303 
> 
> Contingency Table
>            predclasses
> origclasses false true
>       false 1476     0
>       true     4   170
> 