[R] SVM Param Tuning with using SNOW package

cls59 chuck at sharpsteen.net
Wed Nov 18 06:50:08 CET 2009



raluca wrote:
> 
> Hello,
> 
> This is the first time I am using the snow package, and I am trying to tune
> the cost parameter for a linear SVM, where the cost (variable cost1) takes
> 10 values between 0.5 and 30.
> 
> I have a large dataset and a pc which is not very powerful, so I need to
> tune the parameters using both CPUs of the pc.
> 
> Somehow I cannot manage to do it. It seems that both CPUs are fitting the
> model for the same values of cost1, I guess the first 5, but not for the
> last 5.
> 
> Please, can anyone help me!
> 
> Here is the code:  
> 
> data <- data.frame(Y=I(Y),X=I(X))
> data.X<-data$X
> data.Y<-data$Y
> 
> 


Helping you will be difficult: we're only three lines into your example
and already I have no idea what your data look like.  Example code needs to
be fully reproducible, which means a small slice of representative data
needs to be provided, or faked using an appropriate random number
generator.
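As an illustration of faking the data, here is a minimal sketch. The 414
rows come from the sample(1:414,276) call in the question; the number of
predictor columns (5) and the linear relationship between X and Y are pure
assumptions, since the real data were never shown.

```r
# Fake a small data set with the same number of rows (414) that the
# question samples from. The 5 predictor columns and the linear signal
# are assumptions made only for illustration.
set.seed(42)                                  # make the example repeatable
n <- 414
p <- 5                                        # hypothetical number of predictors
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
Y <- as.vector(X %*% seq_len(p)) + rnorm(n)   # response with known structure

data.X <- X
data.Y <- Y

str(data.X)   # lets readers see the shape of the predictors
str(data.Y)   # and of the response
```

With something like this at the top of a post, anyone on the list can paste
the code and run it.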

Some things did jump out at me about your approach and I've made some notes
below.



raluca wrote:
> 
> NR=10
> cost1=seq(0.5,30, length=NR)
> 
> sv.lin<- function(cl,c) {
> 
> for (i in 1:NR) {
> 
> ind=sample(1:414,276)
> 
> hogTest<-  data.frame(Y=I(data.Y[-ind]),X=I(data.X[-ind,])) 
> hogTrain<- data.frame(Y=I(data.Y[ind]),X=I(data.X[ind,])) 
> 
> svm.lin   	  <- svm(hogTrain$X,hogTrain$Y, kernel="linear",cost=c[i],
> cross=5)
> results.lin   <- predict(svm.lin, hogTest$X)
> 
> e.test.lin     <- sqrt(sum((results.lin-hogTest$Y)^2)/length(hogTest$Y))
> 
> return(e.test.lin)
> }
> }
> 
> cl<- makeCluster(10, type="SOCK" ) 
> 


If your machine has two cores, why are you setting up a cluster with 10
nodes?  Usually the number of nodes should equal the number of cores on your
machine in order to keep things efficient.
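One way to pick the node count is sketched below. It assumes the parallel
package that ships with current versions of R is available and uses it only
for detectCores(); the variable names n.cores, n.tasks and n.nodes are
hypothetical. The makeCluster() call from the question is left as a comment.

```r
library(parallel)   # ships with R; used here only for detectCores()

n.cores <- detectCores()
if (is.na(n.cores)) n.cores <- 1L    # detectCores() can return NA

n.tasks <- 10                        # ten cost values in the question
n.nodes <- min(n.cores, n.tasks)     # never more nodes than cores or tasks
n.nodes

# The cluster from the question would then be created as:
# cl <- makeCluster(n.nodes, type = "SOCK")
```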



raluca wrote:
> 
> 
> clusterEvalQ(cl,library(e1071))
> 
> clusterExport(cl,c("data.X","data.Y","NR","cost1")) 
> 
> RMSEP<-clusterApplyLB(cl,cost1,sv.lin)
> 


Are you sure this evaluation even produces results? sv.lin() is a function
you defined above that takes two parameters: "cl" and "c". clusterApplyLB()
will feed each value of cost1 into sv.lin() as the argument "cl", but it has
nothing to give for "c".  At the very least, it seems like you would need
something like the call below. The extra vector is passed unnamed because
clusterApplyLB() has its own "cl" argument, and a named "c = someVector"
would be partially matched to it:

  RMSEP <- clusterApplyLB( cl, cost1, sv.lin, someVector )
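The argument passing can be checked without a cluster. The sketch below
uses lapply() as a serial stand-in, on the assumption that it is a fair
model here: both lapply(x, fun, ...) and clusterApplyLB(cl, x, fun, ...)
call fun(x[[i]], ...). The body of sv.lin() is replaced by a dummy that
only reports what each argument received.

```r
# Same signature as in the question: two arguments, both required.
# The body is a stand-in that just reports what each argument received.
sv.lin <- function(cl, c) {
  list(first = cl, second = c)
}

cost1 <- seq(0.5, 30, length = 10)

# Each cost value lands in "cl" and nothing is supplied for "c",
# so the call fails as soon as "c" is needed.
res <- tryCatch(lapply(cost1, sv.lin),
                error = function(e) conditionMessage(e))
print(res)

# Supplying a second value through ... makes the call legal. It is passed
# unnamed here: with clusterApplyLB(), a named "c =" would be partially
# matched to that function's own "cl" argument.
ok <- lapply(cost1, sv.lin, 1)
print(ok[[1]])
```

Renaming the arguments of the worker function so that none of them is a
prefix of "cl", "x" or "fun" avoids the matching problem altogether.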



raluca wrote:
> 
> 
> stopCluster(cl)
> 
> 


Sorry I can't be very helpful, but with no data and no apparent way to
legally call sv.lin() the way you have it set up, I can't investigate the
problem to see if I get the same results you described.  If you could
provide a complete working example, then there's a better chance that
someone on this list will be able to help you.
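For reference, a complete example of the kind asked for might look like the
sketch below. It is not the original poster's method: the data are faked
(414 rows as in the question, 5 predictors assumed), the train/test split
is made once on the master, the cross=5 option is dropped, and the worker
function fit.one.cost() and its argument names are hypothetical. It also
avoids the for loop in the original sv.lin(), where return() sits inside
the loop and would end the function on the first iteration. The snow and
e1071 packages are assumed to be installed.

```r
library(snow)    # cluster functions used in the thread
library(e1071)   # provides svm()

# Fake data: 414 rows as in the question; 5 predictors is an assumption.
set.seed(1)
n <- 414
data.X <- matrix(rnorm(n * 5), nrow = n)
data.Y <- as.vector(data.X %*% 1:5) + rnorm(n)

# One train/test split made on the master, so every cost value is scored
# on the same held-out rows (a simplification of the original code).
train.rows <- sample(seq_len(n), 276)

# Worker function: fits ONE model for ONE cost value and returns its test
# RMSE. The argument names are chosen so that none of them can be matched
# to clusterApplyLB()'s own arguments (cl, x, fun).
fit.one.cost <- function(cost, features, response, train.rows) {
  fit  <- e1071::svm(features[train.rows, ], response[train.rows],
                     kernel = "linear", cost = cost)
  pred <- predict(fit, features[-train.rows, ])
  sqrt(mean((pred - response[-train.rows])^2))
}

cost1 <- seq(0.5, 30, length = 10)

cl <- makeCluster(2, type = "SOCK")   # one node per core on a two-core pc
clusterEvalQ(cl, library(e1071))      # load e1071 on every node

# Each element of cost1 becomes the first argument of fit.one.cost();
# the data travel to the nodes through the extra named arguments.
RMSEP <- clusterApplyLB(cl, cost1, fit.one.cost,
                        features = data.X, response = data.Y,
                        train.rows = train.rows)
stopCluster(cl)

data.frame(cost = cost1, rmse = unlist(RMSEP))
```

Here the cluster does the looping: each node receives one cost value at a
time, so the ten fits are spread over the two nodes.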

Good luck!

-Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: http://old.nabble.com/SVM-Param-Tuning-with-using-SNOW-package-tp26399401p26402903.html
Sent from the R help mailing list archive at Nabble.com.



