[R] how to replace my double for loop which is little efficient!
Berend Hasselman
bhh at xs4all.nl
Sun Dec 26 15:13:27 CET 2010
bbslover wrote:
>
> x: is a matrix 202*263, that is 202 samples, and 263 independent
> variables
>
> num.compd<-nrow(x); # number of compounds
> diss.all<-0
> for( i in 1:num.compd)
> for (j in 1:num.compd)
> if (i!=j) {
> S1<-sum(x[i,]*x[j,])
> S2<-sum(x[i,]^2)
> S3<-sum(x[j,]^2)
> sim2<-S1/(S2+S3-S1)
> diss2<-1-sim2
> diss.all<-diss.all+diss2}
>
> it will cost a long time to finish this computation! i really need "rapid"
> code to replace my code.
>
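Alternative 1: the dissimilarity is symmetric in i and j, so the j-loop only
needs to start at i+1; doubling the result at the end gives the same total: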
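diss2.all <- 0   # the accumulator must be initialised before the loop
for (i in 1:num.compd) {
  for (j in seq(from=i+1, to=num.compd, length.out=max(0, num.compd-i))) {
    S1 <- sum(x[i,]*x[j,])
    S2 <- sum(x[i,]^2)
    S3 <- sum(x[j,]^2)
    sim2 <- S1/(S2+S3-S1)
    diss2 <- 1-sim2
    diss2.all <- diss2.all+diss2
  }
}
diss2.all <- 2 * diss2.all   # each pair (i,j) with i<j was visited only once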
On my PC this is about twice as fast as your version (with 202 samples and
263 variables).
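Alternative 2: none of the sum() calls is necessary. Use some matrix algebra:
element [i,j] of x %*% t(x) is sum(x[i,]*x[j,]), so all three sums can be
read from that one matrix: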
xtx <- x %*% t(x)
diss3.all <- 0
for (i in 1:num.compd) {
  for (j in seq(from=i+1, to=num.compd, length.out=max(0, num.compd-i))) {
    S1 <- xtx[i,j]
    S2 <- xtx[i,i]
    S3 <- xtx[j,j]
    sim2 <- S1/(S2+S3-S1)
    diss2 <- 1-sim2
    diss3.all <- diss3.all+diss2
  }
}
diss3.all <- 2 * diss3.all
This is about four times as fast as alternative 1.
I'm quite sure that more expert R gurus can get some further speed-up.
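One direction such a speed-up might take is to drop both loops and compute every pairwise term at once. The following is only a sketch, not something timed in this post; the names d, denom, diss and diss4.all are made up here for illustration. It relies on the diagonal of x %*% t(x) holding the row sums of squares, and on outer(d, d, "+") giving S2+S3 for every pair (i,j).

```r
set.seed(1)
x <- matrix(runif(202 * 263), nrow = 202)

xtx <- tcrossprod(x)             # same result as x %*% t(x)
d <- diag(xtx)                   # d[i] equals sum(x[i,]^2)
denom <- outer(d, d, "+") - xtx  # S2 + S3 - S1 for every pair (i,j)
diss <- 1 - xtx / denom          # full matrix of dissimilarities
diag(diss) <- 0                  # leave out the i == j terms explicitly
diss4.all <- sum(diss)           # sums over all ordered pairs with i != j
```

Because the full matrix contains both (i,j) and (j,i), no doubling is needed at the end. The cost is memory for a few n-by-n matrices, which is negligible for 202 samples.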
Note: I generated the x matrix with:
set.seed(1);x<-matrix(runif(202*263),nrow=202)
(Timings on an iMac 2.16 GHz, using 64-bit R.)
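For anyone who wants to repeat the comparison on their own machine, a timing along these lines could be used. This is a sketch under assumptions: the wrapper name diss_alt2 and the choice of 10 repetitions are made up for illustration, and the numbers will differ by machine and R version.

```r
set.seed(1)
x <- matrix(runif(202 * 263), nrow = 202)

# Alternative 2 wrapped in a function (hypothetical name) so it can be timed.
diss_alt2 <- function(x) {
  n <- nrow(x)
  xtx <- x %*% t(x)
  total <- 0
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      total <- total + 1 - xtx[i, j] / (xtx[i, i] + xtx[j, j] - xtx[i, j])
    }
  }
  2 * total
}

# Run several times and report the average elapsed seconds per run.
reps <- 10
timing <- system.time(for (k in seq_len(reps)) result <- diss_alt2(x))
print(timing["elapsed"] / reps)
```

Wrapping each alternative in its own function the same way and comparing the averages gives a fairer picture than a single run.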
Berend