[R] RWeka cross-validation and Weka_control Parametrization
Kurt Hornik
Kurt.Hornik at wu-wien.ac.at
Tue Aug 14 10:54:54 CEST 2007
> On Wed, 01 Aug 2007 10:52:02 +0200, Bjoern wrote:
> Hello,
> I have two questions concerning the RWeka package:
> 1.) First question:
> How can one perform a cross validation, -say 10fold- for a given
> data set and given model ?
> 2.) Second question
> What is the correct syntax for the parametrization of e.g. Kernel
> classifiers interface
> m1 <- SMO(Species ~ ., data = iris, control =
> Weka_control(K="weka.classifiers.functions.supportVector.RBFKernel",G=0.1))
> m2 <- SMO(Species ~ ., data = iris, control =
> Weka_control(K="weka.classifiers.functions.supportVector.RBFKernel",G=1.0))
> > m1
> SMO
> Kernel used:
> RBF kernel: K(x,y) = e^-(0.01* <x-y,x-y>^2)
> ## should be: RBF kernel: K(x,y) = e^-(0.1* <x-y,x-y>^2)
> etc.
The answer to question 2 is surprisingly simple, though it nevertheless
took me about half an hour to find:
m2 <- SMO(Species ~ ., data = iris,
          control = Weka_control(K = "weka.classifiers.functions.supportVector.RBFKernel -G 2"))
gives
R> m2
SMO
Kernel used:
RBF kernel: K(x,y) = e^-(2.0* <x-y,x-y>^2)
[Using Weka_control(K = ..., G = ...) passes the G option to SMO itself,
not to RBFKernel. The docs for SMO() say
  -K <classname and parameters>
      The Kernel to use.
      (default: weka.classifiers.functions.supportVector.PolyKernel)
and one needs to remember Weka's command-line style interface to realize
that this means putting the kernel class name and its own options
together into a single string for the K option.]
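A minimal sketch of how such a K string could be assembled in R, in case
the gamma value comes from a variable. The helper kernel_spec() is
hypothetical (it is not part of RWeka); it only pastes the kernel class
name and its options together, command-line style. The SMO() call is
guarded so that the snippet also runs where RWeka is not installed.

```r
# Hypothetical helper, not part of RWeka: builds the value for SMO's -K
# option by appending "-<name> <value>" pairs to the kernel class name.
kernel_spec <- function(class_name, ...) {
  opts <- list(...)
  if (length(opts) == 0) return(class_name)
  flags <- paste0("-", names(opts), " ", unlist(opts))
  paste(c(class_name, flags), collapse = " ")
}

rbf <- "weka.classifiers.functions.supportVector.RBFKernel"
k <- kernel_spec(rbf, G = 2)
print(k)

# Fit the model only if RWeka (and hence Weka/Java) is available.
if (requireNamespace("RWeka", quietly = TRUE)) {
  m2 <- RWeka::SMO(Species ~ ., data = iris,
                   control = RWeka::Weka_control(K = k))
  print(m2)
}
```

The resulting string is the same one used in the m2 call above, so the
kernel summary should again report the gamma passed via -G.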
This is of course not quite what R users would expect, and we'll try to
improve the Weka control mechanism so that specifying (Weka class)
options which require additional parameters becomes more convenient.
Best
-k