[R] question about svm(e1071)

Mon Jan 17 10:06:15 CET 2011

Dear Prof. Ligges,

Thank you for the reply.
Does the order of calculation change when the samples are shuffled?
Does that happen because of Sequential Minimal Optimization (SMO)?

I noticed that when I set scale=F, the SVs were identical.
However, the differences between the coefs are sometimes relatively large.

Best,

Hiro

### Script start ###

> set.seed(50)
> s <- sample(ncol(data))

> m   <- svm(x=t(data    ), y=factor(data.cl   ), scale=F, type="C-classification", kernel="linear")
> m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=F, type="C-classification", kernel="linear")

> sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
[1] 0
> sum(abs(m$coefs[order(rownames(m$SV))] -m.s$coefs[order(rownames(m.s$SV))]))
[1] 0.3227749

### Script end ###
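
One thing that might be worth trying, as a rough sketch rather than a confirmed fix: svm() has a `tolerance` argument (termination criterion, default 0.001), and a looser stopping rule could leave the two fits at slightly different points. The code below uses synthetic data as a stand-in for the microarray matrix; the helper name `fit_w` and the data dimensions are made up for illustration.

```r
# Sketch only: synthetic data stands in for the real microarray matrix.
# The block is skipped if e1071 is not installed.
if (requireNamespace("e1071", quietly = TRUE)) {
  library(e1071)
  set.seed(1)
  n <- 40   # samples (rows after transposing, as in the script above)
  p <- 200  # features
  x <- matrix(rnorm(n * p), nrow = n,
              dimnames = list(paste0("s", seq_len(n)), NULL))
  y <- factor(rep(c("Normal", "Tumour"), each = n / 2))
  x[y == "Tumour", 1:5] <- x[y == "Tumour", 1:5] + 1.5  # weak class signal
  s <- sample(n)                                        # shuffled sample order

  # Hypothetical helper: fit a linear SVM and return the weight vector w.
  fit_w <- function(x, y, tol) {
    m <- svm(x = x, y = y, scale = FALSE, type = "C-classification",
             kernel = "linear", tolerance = tol)
    drop(t(m$coefs) %*% m$SV)
  }

  # Compare original vs. shuffled fits at the default and a tighter tolerance.
  for (tol in c(1e-3, 1e-6)) {
    w   <- fit_w(x, y, tol)
    w.s <- fit_w(x[s, ], y[s], tol)
    # Allow for a global sign flip between the two fits.
    d <- min(max(abs(w - w.s)), max(abs(w + w.s)))
    cat("tolerance =", tol, " max |w - w.s| (up to sign) =", d, "\n")
  }
}
```

If the gap shrinks with the tighter tolerance, the remaining difference is probably just where the optimizer stopped. Whether this holds for the real data is an assumption that would need checking.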

-----Original Message-----
From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de]
Sent: Saturday, January 15, 2011 3:10 AM
To: 武藤裕紀(創薬資源研究部１ＧＢＩＯＩＮＦ)
Cc: r-help at r-project.org
Subject: Re: [R] question about svm(e1071)

Looking at your results suggests that differences are probably based on 
expected minor numerical inaccuracies and the possibly alternating sign 
of the support vectors.
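
A minimal sketch of how such a comparison could be made robust to both effects: compare the weight vectors up to a global sign flip and within a small tolerance. The helper `same_up_to_sign` is hypothetical (not part of e1071), and the toy vectors stand in for w and w.s. As far as I understand libsvm, the sign can flip because the class that appears first in the data is treated as the positive one, so shuffling may swap it; treat that as an assumption.

```r
# Hypothetical helper (not part of e1071): TRUE if w1 and w2 agree
# within a relative tolerance, either directly or after flipping the sign.
same_up_to_sign <- function(w1, w2, tol = 1e-6) {
  w1 <- as.numeric(w1)
  w2 <- as.numeric(w2)
  stopifnot(length(w1) == length(w2))
  scale  <- max(abs(w1), abs(w2), 1)      # floor of 1 avoids dividing by ~0
  d_same <- max(abs(w1 - w2)) / scale     # distance with the same sign
  d_flip <- max(abs(w1 + w2)) / scale     # distance with the sign flipped
  min(d_same, d_flip) <= tol
}

# Toy vectors standing in for the weight vectors of the two fits.
w_a <- c(0.50, -1.25, 2.00)
w_b <- -w_a + c(1e-9, -2e-9, 1e-9)   # flipped sign plus tiny numerical noise
w_c <- c(0.50, -1.20, 2.00)          # a genuinely different vector

res_flip <- same_up_to_sign(w_a, w_b)
res_diff <- same_up_to_sign(w_a, w_c)
print(res_flip)
print(res_diff)
```

With real models, the arguments would be something like t(m$coefs) %*% m$SV and t(m.s$coefs) %*% m.s$SV. The tolerance is a judgment call and probably needs to be looser than 1e-6 for fits made with the default settings.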

Best,
Uwe Ligges

On 13.01.2011 01:28, mutohrn at chugai-pharm.co.jp wrote:
> Dear all,
>
> I ran an SVM calculation using the e1071 library on a microarray data set (http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt).
> Then I shuffled the samples and ran the SVM calculation again.
> The results of the two calculations were different (in SV, coefs and weights).
>
> I have attached the script below. Could you please tell me why this happens?
> If possible, please tell me how to make them equal.
>
> Best regards,
>
> Hiro
>
> ### Script start ###
>
> library(e1071)
> data<- read.table('http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt', header=TRUE, row.names=1, sep="\t", quote="")
>
> data.cl<- rep(NA,ncol(data))
> data.cl[grep('Normal',colnames(data))]<- 'Normal'
> data.cl[grep('Tumour',colnames(data))]<- 'Tumour'
>
> s<- sample(ncol(data))
>
> m<- svm(x=t(data    ), y=factor(data.cl   ), scale=T, type="C-classification",kernel="linear")
> m.s<- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=T, type="C-classification", kernel="linear")
>
> w<- t(m  $coefs) %*% m$SV
> w.s<- t(m.s$coefs) %*% m.s$SV
>
> # SV and coefs are slightly different
> sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
> sum(abs(m$coefs[order(rownames(m$SV))] -m.s$coefs[order(rownames(m.s$SV))]))
>
> # ranks of the weights are not identical
> all(rank(w)==rank(w.s))
>
> ### Script end ###