[Rd] Using log() on an openMosix cluster

Roger D. Peng rpeng at jhsph.edu
Fri Nov 21 16:50:56 MET 2003


Hi all, I was hoping to get some advice about a problem that I realize 
will be difficult to reproduce for some people.  I'm running R 1.7.1 on 
an openMosix (Linux) cluster and have been experiencing some odd 
slow-downs.  If anyone has experience with such a setup (or a similar 
one) I'd appreciate any help.  Here's a simplified version of the problem.

I'm trying to run the following code:
##
N <- 100000; a <- numeric(N); b <- numeric(N)
e <- rnorm(N)

for (i in 1:N) {
    a[i] <- exp(e[i])
    b[i] <- log(abs(a[i]))
}
##

When I run it on the head node, everything is fine.  However, when I 
send the R process off to one of the cluster nodes (i.e., using mosrun 
from the head node), the program takes about 10 times longer in 
wall-clock time; CPU time is roughly the same.
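
One way to put numbers on the gap between wall-clock and CPU time is to
wrap the loop in system.time().  The following is only a minimal sketch:
the helper name run_log_loop is hypothetical and introduced here for
illustration, and the component names assume a reasonably recent R.

```r
# Hypothetical helper (not part of the original post): wraps the loop
# above in a function so that it can be timed as one unit.
run_log_loop <- function(N = 100000) {
    a <- numeric(N); b <- numeric(N)
    e <- rnorm(N)
    for (i in 1:N) {
        a[i] <- exp(e[i])
        b[i] <- log(abs(a[i]))
    }
    invisible(b)
}

# system.time() reports user CPU, system CPU, and elapsed (wall-clock) time.
timing <- system.time(run_log_loop())
print(timing[c("user.self", "sys.self", "elapsed")])
```

Running this once on the head node and once under mosrun would show
whether the extra time appears as elapsed time only, or also as system
CPU time.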

Interestingly, when I tried running the following code:
##
N <- 100000; a <- numeric(N); b <- numeric(N)
e <- rnorm(N)

for (i in 1:N) {
    a[i] <- exp(e[i])
    b[i] <- exp(abs(a[i]))
}
##

I didn't experience any slow-down!  That is, the wall-clock time is the 
same whether it is run on the head node or on a cluster node.  The only 
difference between the two programs is that one takes a log in the for() 
loop and the other takes an exponential.

I guess my question is: why would taking the log() produce a 10-fold 
increase in runtime?  I know that Mosix clusters can experience serious 
performance hits if you make a lot of system calls or write out data to 
files, but I don't think I'm doing that here.  Is there some major 
difference in the way that exp() and log() are implemented?
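
To separate the cost of the two functions from the rest of the loop,
each one can be timed on its own over the same inputs.  This is a rough
sketch rather than a careful benchmark; time_scalar_calls is a
hypothetical helper name, not something from R itself.

```r
# Hypothetical helper: time repeated scalar calls to a single function
# (fn is expected to be exp or log) and return the elapsed seconds.
time_scalar_calls <- function(fn, x) {
    out <- numeric(length(x))
    system.time(for (i in seq_along(x)) out[i] <- fn(x[i]))[["elapsed"]]
}

# Strictly positive inputs, matching a[i] <- exp(e[i]) in the loops above.
x <- exp(rnorm(100000))

elapsed <- c(exp = time_scalar_calls(exp, x),
             log = time_scalar_calls(log, x))
print(elapsed)
```

If the two timings are close on the head node but far apart on a
cluster node, that would point to log() itself rather than to the
surrounding loop.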

I'm pretty sure this isn't an R problem but I'm wondering if R is doing 
something behind the scenes that's affecting performance in the 
openMosix setting.

Thanks in advance for any help.

-roger


