[Rd] Using log() on an openMosix cluster
Roger D. Peng
rpeng at jhsph.edu
Fri Nov 21 16:50:56 MET 2003
Hi all, I was hoping to get some advice about a problem that I realize
will be difficult to reproduce for some people. I'm running R 1.7.1 on
an openMosix (Linux) cluster and have been experiencing some odd
slow-downs. If anyone has experience with such a setup (or a similar
one) I'd appreciate any help. Here's a simplified version of the problem.
I'm trying to run the following code:
##
N <- 100000; a <- numeric(N); b <- numeric(N)
e <- rnorm(N)
for(i in 1:N) {
    a[i] <- exp(e[i])
    b[i] <- log(abs(a[i]))
}
##
When I run it on the head node, everything is fine. However, when I
send the R process off to one of the cluster nodes (i.e. using mosrun
from the head node), the program takes about 10 times longer in
wall-clock time; CPU time is roughly the same.
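For anyone trying to reproduce this, here is a minimal sketch of how the two times could be compared using system.time(). The function name run_loop is hypothetical, and the component names user.self, sys.self and elapsed assume a reasonably recent version of R (older versions may return an unnamed vector).

```r
# Hypothetical wrapper around the loop above, so it can be timed as a unit.
run_loop <- function(N = 100000) {
    a <- numeric(N); b <- numeric(N)
    e <- rnorm(N)
    for (i in 1:N) {
        a[i] <- exp(e[i])
        b[i] <- log(abs(a[i]))
    }
    invisible(b)
}

# user.self + sys.self is the CPU time used by the R process;
# elapsed is wall-clock time.
timing <- system.time(run_loop())
print(timing[c("user.self", "sys.self", "elapsed")])
```

An elapsed value far larger than user.self + sys.self would correspond to the behaviour described above, where the process spends most of its time waiting rather than computing.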
Interestingly, when I tried running the following code:
##
N <- 100000; a <- numeric(N); b <- numeric(N)
e <- rnorm(N)
for(i in 1:N) {
    a[i] <- exp(e[i])
    b[i] <- exp(abs(a[i]))
}
##
I didn't experience any slow-down! That is, the wall-clock time is the
same when run on the head node or on the cluster nodes. The only
difference between the two programs is that one takes a log in the for()
loop and the other one takes an exponential.
I guess my question is: why would taking the log() produce a 10-fold
increase in runtime? I know that Mosix clusters can experience serious
performance hits if you make a lot of system calls or write out data to
files, but I don't think I'm doing that here. Is there some major
difference in the way that exp() and log() are implemented?
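To narrow this down, each function could be timed on its own, on the head node and again under mosrun. The following is only a sketch: time_scalar_calls is a hypothetical helper, the seed is arbitrary, and the column names again assume a recent version of R.

```r
# Hypothetical helper: time one scalar call of f per element of x.
time_scalar_calls <- function(f, x) {
    out <- numeric(length(x))
    system.time(for (i in seq_along(x)) out[i] <- f(x[i]))
}

set.seed(1)                  # arbitrary seed, only for repeatability
x <- exp(rnorm(100000))      # strictly positive, so log() is well defined

# One row per function; columns are the components of system.time().
timings <- rbind(exp = unclass(time_scalar_calls(exp, x)),
                 log = unclass(time_scalar_calls(log, x)))
print(timings[, c("user.self", "sys.self", "elapsed")])
```

If the log row alone shows a large elapsed time on a cluster node while its CPU columns stay close to the exp row, that would point at something log() triggers outside of pure computation, rather than at the loop itself.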
I'm pretty sure this isn't an R problem but I'm wondering if R is doing
something behind the scenes that's affecting performance in the
openMosix setting.
Thanks in advance for any help.
-roger