[R] case study of efficient R coding
Jun Yan
jyan at stat.wisc.edu
Tue Oct 15 22:35:43 CEST 2002
This is an interesting case study. Method 3 and 4 only differ in how the
indexes are obtained. Method 3 uses seq, which uses the primitive function
c. Method 4 evaluates p^2 logical expression in R. When p is small, method
4 can beat method 3. But as p gets bigger, the time usage by method 3 only
slightly increases, while the time usage by method 4 increasely
dramatically.
test <- function(nsim, p) {
z1<- system.time(for (i in 1:nsim) seq(1, by=(p+1), length=p))
mat <- matrix(1, p, p)
z2 <- system.time(for (i in 1:nsim) row(mat) == col(mat))
rbind(seq=z1, rowcol=z2)
}
> test(1000, 200)
[,1] [,2] [,3] [,4] [,5]
seq 0.21 0 0.21 0 0
rowcol 9.16 0 9.16 0 0
> test(1000, 100)
[,1] [,2] [,3] [,4] [,5]
seq 0.17 0.01 0.19 0 0
rowcol 2.06 0.00 2.05 0 0
> test(1000, 10)
[,1] [,2] [,3] [,4] [,5]
seq 0.16 0.00 0.16 0 0
rowcol 0.04 0.01 0.05 0 0
On Tue, 15 Oct 2002, Jason Liao wrote:
> The following are four ways I coded for the variance of multinomial
> distribution and their CPU time. It is a bit surprising that the 3rd
> method beats the 4th one substantially.
>
>
> var.multinomial1 = function(n, pi)
> + {
> + p = length(pi);
> + var = array(0, c(p,p));
> +
> + for(i in 1:p)
> + for(j in 1:p)
> + {
> + if(i==j) var[i,j] = n*pi[i]*(1-pi[1])
> + else var[i,j] = - n*pi[i]*pi[j];
> + }
> + }
> >
> > var.multinomial2 = function(n, pi)
> + {
> + n*( -pi%*%t(pi) + diag(pi) )
> + }
> >
> > var.multinomial3 = function(n, pi)
> + {
> + var = -pi%*%t(pi);
> + var[row(var)==col(var)] = pi*(1-pi);
> + n*var;
> + }
> >
> > var.multinomial4 = function(n, pi)
> + {
> + var = -pi%*%t(pi);
> + var[seq(1, by=(p+1), length=p)] = pi*(1-pi);
> + n*var;
> + }
> >
> > n = 100;
> > pi = exp( rnorm(10));
> > pi = pi/sum(pi);
> > n.simu = 1000;
> >
> > system.time(for(i in 1:n.simu) var = var.multinomial1(n, pi))
> [1] 4.50 0.04 99.73 NA NA
> > system.time(for(i in 1:n.simu) var = var.multinomial2(n, pi))
> [1] 0.33 0.00 10.74 NA NA
> > system.time(for(i in 1:n.simu) var = var.multinomial3(n, pi))
> [1] 0.14 0.00 4.17 NA NA
> > system.time(for(i in 1:n.simu) var = var.multinomial4(n, pi))
> [1] 0.21 0.00 6.47 NA NA
> >
> >
> >
> But first, does anyone know a vectorized way of coding (summing the
> outter product of rows)
>
> p = 10;
> x = rnorm(p*p);
> dim(x) = c(p.p);
>
> var = array(0, c(p,p));
> for(i in 1:p) var = var + x[i,]%*% t(x[i,]);
>
> =====
> Jason G. Liao, Ph.D.
> Division of Biometrics
> University of Medicine and Dentistry of New Jersey
> 335 George Street, Suite 2200
> New Brunswick, NJ 08903-2688
> phone (732) 235-8611, fax (732) 235-9777
> http://www.geocities.com/jg_liao
>
> __________________________________________________
>
> Faith Hill - Exclusive Performances, Videos & More
>
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
More information about the R-help
mailing list