[R] Problem with SVM and scaling

Fri Dec 17 13:47:00 CET 2004

Ton:

Does preprocessing (scaling, removing constant variables, etc.) "by
hand" of the whole data set *before* splitting resolve things?

You will need the same variable structure in the training and the test
set anyway; scaling is just the first part of the code that fails on
your data...
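A minimal base-R sketch of that suggestion, on made-up data (the matrix, the
descriptor names d1..d6 and the 14/6 split are assumptions for illustration,
not taken from the original data set). It drops descriptors that are constant
over the whole data set before splitting, then reuses the training-set column
selection for the test set so both parts keep the same variable structure:

```r
# Hypothetical data: 20 samples, 6 descriptors, two of them constant.
set.seed(1)
x <- matrix(rnorm(20 * 6), nrow = 20,
            dimnames = list(NULL, paste0("d", 1:6)))
x[, "d2"] <- 0   # constant over the whole data set
x[, "d5"] <- 1   # constant over the whole data set

# Drop descriptors that are constant over the whole data set.
keep <- apply(x, 2, function(col) length(unique(col)) > 1)
x <- x[, keep, drop = FALSE]

# Split only afterwards, so both parts start from the same columns.
train_idx <- sample(nrow(x), 14)
x_train <- x[train_idx, , drop = FALSE]
x_test  <- x[-train_idx, , drop = FALSE]

# A descriptor can still be constant within the training part alone:
# select columns on the training set and reuse that selection for the test set.
keep_train <- apply(x_train, 2, function(col) length(unique(col)) > 1)
x_train <- x_train[, keep_train, drop = FALSE]
x_test  <- x_test[, keep_train, drop = FALSE]
```

With identical columns in x_train and x_test, the model fit and the later
predict() call see the same number of variables, which is what the scaling
step needs.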

g,
-d

-----

Hi all -

I am running into a problem with svm() when applying it to data sets
that have descriptors with zero variance. Here is the sequence of
events:

1. I split my data set with 512 descriptors into a training and a test set
2. I build an SVM model for the training set. Out of 512 descriptors,
12 have zero variance; I discard them before calling svm(), leaving 500
3. For the test set, 8 descriptors have zero variance, which I discard
too, leaving 504
4. predict.svm() then fails, because it tries to scale using two vectors
of different lengths (500 and 504)
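The mismatch in the steps above can be reproduced without any SVM code. This
is a small base-R sketch on made-up data; the column names and the helper
non_constant() are hypothetical and only illustrate how filtering each part
separately leaves different descriptor sets:

```r
# Made-up data: 6 samples, 5 descriptors (names are hypothetical).
x <- cbind(a = c(1, 2, 3, 4, 5, 6),
           b = c(0, 0, 0, 1, 2, 3),   # constant in rows 1-3 only
           c = c(1, 2, 3, 7, 7, 7),   # constant in rows 4-6 only
           d = c(5, 5, 5, 5, 5, 5),   # constant everywhere
           e = c(1, 1, 1, 2, 3, 4))   # constant in rows 1-3 only

train <- x[1:3, ]
test  <- x[4:6, ]

# Hypothetical helper: names of columns with more than one distinct value.
non_constant <- function(m) {
  colnames(m)[apply(m, 2, function(col) length(unique(col)) > 1)]
}

non_constant(train)  # "a" "c"
non_constant(test)   # "a" "b" "e"
```

The two parts end up with 2 and 3 descriptors, the same kind of mismatch as
500 versus 504 in the real data.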

Is there a way to get around this?

-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/~meyer/