[R] tuning SVMs

David Meyer david.meyer at wu-wien.ac.at
Wed Dec 1 13:49:22 CET 2004


Stephen:

Your calls to best.tune() do not tune anything unless you specify the
parameter ranges (see the examples on the help page). Your calls just
use the defaults, which are very unlikely to yield models with good
performance.
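
For example (a minimal sketch -- the grid values below are purely
illustrative, not recommendations):

  obj <- best.tune(svm, similarity ~ ., data = training,
                   ranges = list(gamma = 2^(-8:0), cost = 2^(0:6)),
                   kernel = "radial")
  summary(obj)

By default, tune() and best.tune() evaluate each grid point by
10-fold cross-validation.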

[I think some day, I will have to remove the defaults in svm()...]

Another point: why aren't you using classification machines? (svm()
switches to classification automatically when the dependent variable
is a factor.)
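
For example (assuming similarity is coded 0/1 in both data frames):

  training$similarity <- factor(training$similarity)
  testing$similarity  <- factor(testing$similarity)
  ## svm() now fits a C-classification machine, and predict() returns
  ## class labels directly, so there is no need to threshold at 0.5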

There is also classAgreement() in e1071, which you might want to look at.
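It computes several agreement coefficients, kappa among them, directly
from a confusion matrix, e.g.:

  classAgreement(table(testing$similarity, pred))$kappa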

Cheers,
David






Hi

I am doing this sort of thing:
 
POLY:
 
> obj <- best.tune(svm, similarity ~ ., data = training, kernel = "polynomial")
> summary(obj)
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "polynomial") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  polynomial 
       cost:  1 
     degree:  3 
      gamma:  0.04545455 
     coef.0:  0 
    epsilon:  0.1 
 
 
Number of Support Vectors:  754
 
> svm.model <- svm(similarity ~ ., data = training, kernel = "polynomial",
+                  cost = 1, degree = 3, gamma = 0.04545455, coef0 = 0,
+                  epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > 0.5] <- 1
> pred[pred <= 0.5] <- 0
> table(testing$similarity, pred)
   pred
    0  1 
  0 30  8
  1 70 63
LINEAR:

> obj <- best.tune(svm, similarity ~ ., data = training, kernel = "linear")
> summary(obj)

Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "linear")
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  linear 
       cost:  1 
      gamma:  0.04545455 
    epsilon:  0.1 
 
 
Number of Support Vectors:  697
 
> svm.model <- svm(similarity ~ ., data = training, kernel = "linear",
+                  cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > 0.5] <- 1
> pred[pred <= 0.5] <- 0
> table(testing$similarity, pred)
   pred
    0   1  
  0   6  32
  1   4 129
 
 
RADIAL:
 
> obj <- best.tune(svm, similarity ~ ., data = training, kernel = "radial")
> summary(obj)
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "linear") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  linear 
       cost:  1 
      gamma:  0.04545455 
    epsilon:  0.1 
 
 
Number of Support Vectors:  697
 
> svm.model <- svm(similarity ~ ., data = training, kernel = "radial",
+                  cost = 1, gamma = 0.04545455, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > 0.5] <- 1
> pred[pred <= 0.5] <- 0
> table(testing$similarity, pred)
   pred
    0  1 
  0 27 11
  1 64 69
 
 
SIGMOID:
 
> obj <- best.tune(svm, similarity ~ ., data = training, kernel = "sigmoid")
> summary(obj)
 
Call:
 best.tune(svm, similarity ~ ., data = training, kernel = "sigmoid") 
 
Parameters:
   SVM-Type:  eps-regression 
 SVM-Kernel:  sigmoid 
       cost:  1 
      gamma:  0.04545455 
     coef.0:  0 
    epsilon:  0.1 
 
 
Number of Support Vectors:  986
 
> svm.model <- svm(similarity ~ ., data = training, kernel = "sigmoid",
+                  cost = 1, gamma = 0.04545455, coef0 = 0, epsilon = 0.1)
> pred <- predict(svm.model, testing)
> pred[pred > 0.5] <- 1
> pred[pred <= 0.5] <- 0
> table(testing$similarity, pred)
   pred
    0   1  
  0   8  30
  1  26 107
 
and then computing the kappa statistic to see whether I am getting
anything significant.

I get kappas of 15-17%, which I don't think is very good. I know kappa
is really meant for comparing the outcomes of two taggers, but it seems
a good way to check whether your results might be due to chance.
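
For reference, here is a sketch of that computation on the polynomial
table above:

  tab <- matrix(c(30, 70, 8, 63), nrow = 2)             # rows = truth, cols = pred
  po  <- sum(diag(tab)) / sum(tab)                      # observed agreement
  pe  <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2  # agreement expected by chance
  (po - pe) / (1 - pe)                                  # kappa, about 0.17 here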
 
Two questions:
 
Any comments on Kappa and what it might be telling me?
 
What can I do to tune my kernels further?
 
Stephen
-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/



