[R] Help with SVM package Kernlab

Steve Lianoglou mailinglist.honeypot at gmail.com
Fri Dec 25 06:01:34 CET 2009


Hi,

Comments in line:

On Thu, Dec 24, 2009 at 11:42 PM, Vishal Thapar <vishalthapar at gmail.com> wrote:
> Hi useR's,
>
> I am resending this request since I got no response for my last post and I
> am new to the list so pardon me if I am violating the protocol.
>
> I am trying to use the "Kernlab" package for training and prediction using
> SVM's. I am getting the following error when I am trying to use the predict
> function:
>
>> predictSvm = predict(modelforSVM, testSeq);
> Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :   contrasts can
> be applied only to factors with 2 or more levels

It's hard to say without a reproducible example, but it looks like the
data you are passing to predict differs from what the SVM saw in
training. What do these commands return on your data?

1. is(train500)
2. is(train500$Class)
3. is(train500[1,5])
4. is(testSeq)
5. is(testSeq[1,5])
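
If the classes look the same, the next thing I would compare is the
factor levels of each column. Below is a rough sketch using only base R.
The data frames `train` and `test` are made-up stand-ins for your
train500 and testSeq (I don't have your data), and every name in it is
an assumption on my part:

```r
# Hypothetical stand-ins for train500 and testSeq; names and sizes are
# assumptions, not the poster's actual data.
set.seed(1)
bases <- c("A", "C", "G", "T")
train <- data.frame(
  Class = factor(rep(c("+", "-"), each = 10)),
  V1 = factor(sample(bases, 20, replace = TRUE), levels = bases),
  V2 = factor(sample(bases, 20, replace = TRUE), levels = bases)
)

# A one-row test set built from single characters: each column ends up
# as a factor with only one level.
test <- data.frame(V1 = factor("A"), V2 = factor("G"))

# Count the levels column by column in both data frames.
predictors <- setdiff(names(train), "Class")
level_counts <- data.frame(
  column       = predictors,
  train_levels = sapply(predictors, function(v) nlevels(train[[v]])),
  test_levels  = sapply(predictors, function(v) nlevels(test[[v]]))
)
print(level_counts)

# Columns whose level sets differ between training and test data.
same <- sapply(predictors, function(v) {
  identical(levels(train[[v]]), levels(test[[v]]))
})
mismatched <- predictors[!same]
print(mismatched)
```

Any column listed in `mismatched` is one I would look at first, since a
factor with fewer than 2 levels is exactly what the error message
complains about.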

> The training file is a data frame with 501 columns: Col 1 is "Class" which
> is "+" or "-" and Cols V1 to V500 are "A/C/G/T" . There are 200 seq's for
> training (100 + and - each). this is very similar to the "promotergene" data
> set included as example with the package.

How similar are we talking -- something is (obviously) off because
using the promotergene dataset is quite straightforward:

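library(kernlab)
data(promotergene)
# Split the data: first 90 rows for training, the rest for testing
tr <- promotergene[1:90,]
ts <- promotergene[91:106,]
# Train on the training rows only
m <- ksvm(Class~., data=tr, kernel="rbfdot", kpar =
"automatic", C = 60, cross = 3, prob.model = TRUE)
# Predict classes for the held-out rows
p <- predict(m, ts)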

>
> The model that I have generated is as follows:
>
> modelforSVM <- ksvm(Class ~ ., data = train500, kernel = "rbfdot", kpar =
> "automatic", C = 60, cross = 3, prob.model = TRUE)
>
> The testSeq is a vector of 500 characters casted as a data.frame.

What does that mean, exactly? How did you do that?

Can't you just start with all of your data in a data.frame and "cut
out" the training and testing data.frames like I did above with the
promotergene data (see the tr and ts vars)?
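
My guess (and it is only a guess without seeing your code) is that
turning a single character vector into a one-row data.frame gives you
factors with one level each, and that building the design matrix for
prediction then fails. The sketch below reproduces that error message
with base R's model.matrix and shows one possible fix: reuse the
training levels when building the test row. The objects `train`,
`seq_chars`, `test_naive` and `test_fixed` are hypothetical, and I am
assuming the formula interface does something similar to model.matrix
internally:

```r
# Hypothetical training data; the names are assumptions for illustration.
bases <- c("A", "C", "G", "T")
set.seed(2)
train <- data.frame(
  Class = factor(rep(c("+", "-"), each = 10)),
  V1 = factor(sample(bases, 20, replace = TRUE), levels = bases),
  V2 = factor(sample(bases, 20, replace = TRUE), levels = bases)
)

# One sequence as a named character vector, "cast" to a one-row data.frame.
seq_chars <- c(V1 = "A", V2 = "G")
test_naive <- as.data.frame(lapply(seq_chars, factor))

# Each column is a factor with a single level, so the design matrix
# cannot be built; capture the error message instead of stopping.
err <- tryCatch(
  {
    model.matrix(~ ., data = test_naive)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Possible fix: give every test column the levels seen in training.
test_fixed <- as.data.frame(Map(
  function(x, v) factor(x, levels = levels(train[[v]])),
  seq_chars, names(seq_chars)
))
mm <- model.matrix(~ ., data = test_fixed)
print(dim(mm))
```

Cutting the test rows out of the same data.frame as the training rows
keeps the levels consistent automatically, which is why I would try
that first.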

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



