[R] e1071/svm?

David Meyer david.meyer at ci.tuwien.ac.at
Thu Jan 10 13:57:06 CET 2002


The Glass data I used in my example is a multi-class classification
problem, and (as I wrote in my last email) this case is more complicated:

To handle k classes, k > 2, svm() trains all pairwise binary
subclassifiers (the one-against-one method) and then uses a voting
mechanism to determine the predicted class.

Now, this means k(k-1)/2 classifiers, hence in principle k(k-1)/2 sets
of SVs, coefficients and rhos. These are stored in a compressed format:

1) Each SV is stored only once, even if it is used by several
classifiers. The model$SV matrix is ordered by classes, and you find the
starting index of each class's block by using nSV (the number of SVs per
class):

start <- c(1, cumsum(model$nSV) + 1)
start <- start[-length(start)]

sum(model$nSV) equals the total number of (distinct) SVs, i.e.
model$tot.nSV.
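
For example (a minimal sketch, assuming the Glass data from the mlbench
package; the R News example used the default radial kernel, but I fit
with a linear kernel and without scaling here, so that the hand-made
prediction code further below applies to the raw data directly):

library(e1071)
library(mlbench)
data(Glass)

## 6-class problem: Type is the class, the other 9 columns are features
model <- svm(Type ~ ., data = Glass, kernel = "linear", scale = FALSE)

## start indices of the classes' SV blocks:
start <- c(1, cumsum(model$nSV) + 1)
start <- start[-length(start)]

sum(model$nSV) == model$tot.nSV   ## TRUE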

2) The coefficients of the SVs are stored in the model$coefs matrix,
grouped by classes. Because the separating hyperplanes found by the
svm algorithm have SVs on both sides, there are two sets of
coefficients per binary classifier, and, e.g., for 3 classes, you could
build a *block* matrix like this for the classifiers (i,j)
[i, j = class numbers]:

   j     0         1        2
i      
0        X     set (0,1)  set (0,2)
1    set (1,0)     X      set (1,2)
2    set (2,0) set (2,1)     X

where set(i,j) are the coefficients for the classifier (i,j), lying on
the side of class j.

Because there are no entries for (i,i), we can drop the diagonal and
shift the lower triangle up to get

   j     0         1        2
i      
0    set (1,0) set (0,1)  set (0,2)
1    set (2,0) set (2,1)  set (1,2)


Each set (.,j) has length nSV[j]; since not every SV of class j is
active in every binary classifier, some sets will contain padding 0s.

model$coefs is the *transpose* of such a matrix, therefore for the
Glass data, which has 6 classes, you get 6 - 1 = 5 columns.
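
You can check this directly (continuing the sketch from above):

## one row per (distinct) SV, nclasses - 1 columns:
dim(model$coefs)   ## equals c(model$tot.nSV, model$nclasses - 1)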

For i < j (numbering the classes from 1 here, as R does), the
coefficients of set(j,i), i.e. the side of class i, start at

m$coefs[start[i], j - 1]

and those of set(i,j), the side of class j, at

m$coefs[start[j], i] .
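
For instance (continuing the sketch; the pair i = 2, j = 5 is an
arbitrary choice):

i <- 2; j <- 5
## coefficients of the class-i SVs in classifier (i,j):
coef.i <- model$coefs[start[i] : (start[i] + model$nSV[i] - 1), j - 1]
## coefficients of the class-j SVs in classifier (i,j):
coef.j <- model$coefs[start[j] : (start[j] + model$nSV[j] - 1), i]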

The k(k-1)/2 rhos are stored linearly in the vector m$rho, with the
pairs enumerated as (1,2), (1,3), ..., (1,k), (2,3), ..., i.e., in the
order of the double loop in the code below.
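
If you need the position of a single classifier's rho, here is a small
helper (just a sketch, and ``rho.index'' is an illustrative name of
mine) that follows this enumeration:

## linear index of the pair (i,j), i < j, among k classes:
## i - 1 complete "rows" of pairs come first, row t holding k - t pairs
rho.index <- function(i, j, k)
  (i - 1) * k - i * (i - 1) / 2 + (j - i)

rho.index(1, 2, 6)   ## 1, the first rho
rho.index(5, 6, 6)   ## 15 = choose(6,2), the last one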


The following code shows how to use this for prediction:

## linear kernel function
K <- function(u, v) crossprod(u, v)

predsvm <- function(object, newdata) {
  ## newdata is a single new data vector

  ## compute the start index of each class's block in object$SV
  start <- c(1, cumsum(object$nSV) + 1)
  start <- start[-length(start)]

  ## kernel values between all SVs and the new data vector
  kernel <- sapply(1:object$tot.nSV,
                   function(x) K(object$SV[x,], newdata))

  ## raw decision value of the binary classifier (i,j)
  predone <- function(i, j) {
    ## row ranges of the SVs of class i and class j:
    ri <- start[i] : (start[i] + object$nSV[i] - 1)
    rj <- start[j] : (start[j] + object$nSV[j] - 1)

    ## the two coefficient sets of classifier (i,j):
    coef1 <- object$coefs[ri, j - 1]
    coef2 <- object$coefs[rj, i]

    ## return the raw value:
    crossprod(coef1, kernel[ri]) + crossprod(coef2, kernel[rj])
  }

  ## collect the votes of all k(k-1)/2 classifiers
  votes <- rep(0, object$nclasses)
  cnt <- 0   # rho counter
  for (i in 1 : (object$nclasses - 1))
    for (j in (i + 1) : object$nclasses) {
      cnt <- cnt + 1
      if (predone(i, j) > object$rho[cnt])
        votes[i] <- votes[i] + 1
      else
        votes[j] <- votes[j] + 1
    }

  ## return the winner (first class with the maximum number of votes)
  object$levels[which.max(votes)]
}
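
A quick check against the built-in predict method (continuing the
sketch from above; this assumes that libsvm's internal class ordering
matches the factor levels, which it should for the Glass data, where
the observations are grouped by class):

## predict the first observation by hand and via predict():
predsvm(model, unlist(Glass[1, -10]))
predict(model, Glass[1, -10])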




Alexander Skomorokhov wrote:
> 
> David,
> 
> Happy New Year. Thanks a lot for your answer.
> 
> I was successful with my artificial linear example, and with my
> artificial nonlinear example as well.
> But when I turned to an example from your paper in "R News", I ran
> into a problem again.
> 
> >dim(m$SV)
> [1] 156 9
> 
> That's okay. I expect 156 coefs for calculating the prediction. But what I get is:
> 
> >dim(m$coefs)
> [1] 156 5
> 
> Why 5??? The kernel function (default radial basis) gives me a scalar.
> For one new data point I will get a vector of 156 items (kernel values
> of the new point with each support vector). I have to take the weighted
> sum of these items with the coefs and then the sign after subtracting
> rho. But I have 5 times more coefs than I need.
> 
> Please clarify these points.
> 
> Regards,
> Alexander.
> 
> > -----Original Message-----
> > From: meyer at ci.tuwien.ac.at [mailto:meyer at ci.tuwien.ac.at]On
> > Behalf Of David Meyer
> > Sent: Thursday, January 03, 2002 3:27 PM
> > To: askom at obninsk.com; r-help
> > Subject: Re: [R] e1071/svm?
> >
> >
> > OK, after having inspected the code of libsvm, I agree that it is not
> > completely obvious how to use the results produced by svm() for
> > prediction.
> >
> > For class prediction in the binary case, the class of a new data vector
> > ``n'' is usually given by *the sign* of
> >
> > Sum_i (a_i * y_i * K(x_i, n)) + rho
> >
> > where x_i is the i-th support vector, y_i the corresponding label, a_i
> > the corresponding coefficient, and K is the kernel (in your case, the
> > linear one, so
> > K(u,v) = u'v).
> >
> >
> > Now, ``libsvm'' actually returns a_i * y_i as the i-th coefficient and the
> > *negative* rho, so in fact uses the formula:
> >
> > Sum_i (coef_i * K(x_i, n)) - rho
> >
> > where the training examples (=training data) are labeled {1,-1}.
> >
> >
> > A simplified R function for prediction with a linear kernel would be:
> >
> > svmpred <- function (m, newdata, K=crossprod) {
> >   ## this guy does the computation:
> >   pred.one <- function (x)
> >     sign(sum(sapply(1:m$tot.nSV, function (j)
> >                     K(m$SV[j,], x) * m$coefs[j]
> >                     )
> >              ) - m$rho
> >          )
> >
> >   ## this is just for convenience:
> >   if (is.vector(newdata))
> >     newdata <- t(as.matrix(newdata))
> >   sapply (1:nrow(newdata),
> >           function (i) pred.one(newdata[i,]))
> > }
> >
> >
> > where ``pred.one'' does the actual prediction for one new data vector,
> > the rest is just a convenience for prediction of multiple new examples.
> >
> > It's easy to extend this to other kernels, just replace ``K'' with the
> > appropriate function [see the help page for the formulas used] and
> > supply the additional constants.
> >
> > Note, however, that multi-class prediction is more complicated, because
> > the coefficients of the diverse binary svm's are stored in a compressed
> > format.
> >
> > And finally, you could directly use the C source code of libsvm, if your
> > target language supports foreign function calls.
> >
> > regards,
> >
> > -d
> >
> >
> > Alexander Skomorokhov wrote:
> > >
> > > Thank you for your reply. Sorry, but I still don't get it.
> > > Here is a trivial example (copied from an R session):
> > >
> > > ------------------------------------------------------------------------------
> > > > x
> > >      [,1] [,2]
> > > [1,]    0    0
> > > [2,]    0    1
> > > [3,]    1    0
> > > [4,]    1    1
> > > [5,]    2    2
> > > [6,]    2    3
> > > [7,]    3    2
> > > [8,]    3    3
> > > > y
> > > [1] 1 1 1 1 2 2 2 2
> > > Levels:  1 2
> > > > is.factor(y)
> > > [1] TRUE
> > > > library(e1071)
> > > > m<-svm(x,y,kernel='linear')
> > > *
> > > optimization finished, #iter = 3
> > > nu = 0.250000
> > > obj = -1.000000, rho = -3.000000
> > > nSV = 2, nBSV = 2
> > > Total nSV = 2
> > >
> > > > summary(m)
> > > Call:
> > >  svm.default(x = x, y = y, kernel = "linear")
> > > Parameters:
> > >    SVM-Type:  C-classification
> > >  SVM-Kernel:  linear
> > >        cost:  1
> > >      degree:  3
> > >       gamma:  0.5
> > >      coef.0:  0
> > >          nu:  0.5
> > >     epsilon:  0.5
> > >        cost:  1
> > > Number of Support Vectors:  2  ( 1 1 )
> > > Number of Classes:  2
> > > Levels:
> > >  1 2
> > > Rho:
> > >  -3
> > > Support Vectors:
> > >      [,1] [,2]
> > > [1,]    1    1
> > > [2,]    2    2
> > > Coefficiants:
> > >      [,1]
> > > [1,]    1
> > > [2,]   -1
> > >
> > > ------------------------------------------------------------------------------
> > > I see Coefficiants, but can't guess how to use them (only them?) for
> > > prediction???
> > > I guess the question may sound like a stupid one, but there it is :-(
> > >
> > > Thanks,
> > > Alexander.
> > >
> > > > -----Original Message-----
> > > > From: meyer at ci.tuwien.ac.at [mailto:meyer at ci.tuwien.ac.at]On
> > > > Behalf Of David Meyer
> > > > Sent: Thursday, December 20, 2001 3:03 PM
> > > > To: askom at obninsk.com
> > > > Cc: r-help
> > > > Subject: Re: [R] e1071/svm?
> > > >
> > > >
> > > > Alexander Skomorokhov wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > I use function "svm" (interface to libsvm) from package e1071. It
> > > > > works just fine.
> > > > > And I may predict with function "predict" and an svm model trained
> > > > > by function "svm".
> > > > > What I need is to move the results of svm training to another
> > > > > application (non-R) and perform the prediction there. But function
> > > > > "svm" returns only the list of support vectors and doesn't return
> > > > > the coefficients of the separating hyperplane (w).
> > > >
> > > > It does (element ``coefs'' of the returned object).
> > > >
> > > > g.
> > > > -d
> > > >
> > > > >
> > > > > So, the question is: how can I use the results of svm training to
> > > > > write (in another language) a prediction function for the linear
> > > > > and nonlinear cases?
> > > > >
> > > > > Thanks,
> > > > > Alexander.
> > > > >
> > > > >

-- 
	Mag. David Meyer		Wiedner Hauptstrasse 8-10
Vienna University of Technology		A-1040 Vienna/AUSTRIA
       Department for			Tel.: (+431) 58801/10772
Statistics and Probability Theory	mail: david.meyer at ci.tuwien.ac.at
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._