[R] long run time for loop operation & matrix fill
Bert Gunter
gunter.berton at gene.com
Thu Aug 7 23:52:30 CEST 2008
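outer() trades space for speed. It *does* vectorize the calculation: it
expands its two arguments into full-length vectors and calls FUN once on
them, so the loops run in the underlying C code, provided FUN is itself
vectorized.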
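A small sketch of what that means in practice (the helper names f, g, calls
and arg_len are made up for illustration; only outer(), Vectorize() and pmax()
are base R). It counts how often outer() calls FUN, and shows that a function
which is not vectorized has to be wrapped, here with Vectorize(), before
outer() can use it:

```r
# Count how many times outer() calls FUN and how long its arguments are.
calls <- 0
arg_len <- NA
f <- function(a, b) {
  calls <<- calls + 1      # record each call
  arg_len <<- length(a)    # record the length of the first argument
  a + b                    # "+" is vectorized, so this works element-wise
}
res <- outer(1:3, 1:2, FUN = f)
calls     # FUN was called once ...
arg_len   # ... on vectors of length 3 * 2

# g always returns a single value, so outer() cannot reshape its result.
g <- function(a, b) max(a, b)
failed <- inherits(try(outer(1:3, 1:2, FUN = g), silent = TRUE), "try-error")

# Vectorize() wraps g in an mapply() loop; this works, but the loop now
# runs at R level again, so most of the speed advantage is lost.
res2 <- outer(1:3, 1:2, FUN = Vectorize(g))
all(res2 == outer(1:3, 1:2, FUN = pmax))
```

When a vectorized equivalent exists (pmax() in place of max() here), it is
usually the better choice than Vectorize().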
The apply() family of functions (eapply, mapply and rapply are other base R
versions that you missed; there are others in packages) are basically just
efficiently written looping functions. They may or may not offer much
speedup over explicit loops. As you said, their greatest advantage is
elegance and code readability (as functional programming, rather than
procedural programming, constructs).
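To illustrate with a toy case (the variable names are made up for this
sketch): the explicit loop and sapply() below both call the R-level code once
per element, while the last line hands the whole vector to a single vectorized
operation. All three give the same answer; only the last one avoids the
R-level loop.

```r
x <- 1:10

# Explicit loop over a preallocated result vector.
out_loop <- numeric(length(x))
for (i in seq_along(x)) {
  out_loop[i] <- x[i]^2
}

# sapply(): the same loop, expressed as a function applied to each element.
out_sapply <- sapply(x, function(v) v^2)

# Vectorized: one call, the looping happens in compiled code.
out_vec <- x^2

isTRUE(all.equal(out_loop, out_sapply))
isTRUE(all.equal(out_loop, out_vec))
```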
As you also said, vectorizing calculations is a central theme in R that
takes some getting used to. I know of no general prescriptions for how to do
it; I, too, am still learning.
Finally, please heed Roland's (and r-help's) advice: provide a small,
reproducible example if you want specific help.
-- Bert Gunter
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Roland Rau
Sent: Thursday, August 07, 2008 2:22 PM
To: rcoder
Cc: r-help at r-project.org
Subject: Re: [R] long run time for loop operation & matrix fill
Hi rcoder,
rcoder wrote:
> Hi everyone,
>
> I'm running some code containing an outer and inner loop, to fill cells in
> a 2500x1500 results matrix. I left my program running overnight, and it was
> still running when I checked 17 hours later. I have tested the operation on
> a smaller matrix and it executes fine, so I believe there is nothing wrong
> with the code. I was just wondering if this is normal program execution
> speed for such an operation on a P4 with 2GB RAM?
>
Loops are not one of the strengths of R, I would say (at least not
explicit ones). This is why many books and manuals on R devote
considerable space to "the whole object view", vectorizing calculations,
and general strategies for avoiding loops in R.
I (we) don't know what your actual program is doing. Probably applying a
rather complicated function to each cell of your matrix?
I ran this code:
mymatrix <- matrix(rep(0.1, 2500*1500), ncol=1500)
system.time(
  for (i in 1:(nrow(mymatrix))) {
    for (j in 1:(ncol(mymatrix))) {
      mymatrix[i,j] <- i+j
    }
    if ((i %% 100)==0) cat(i,"\n")
  }
)
(cat output omitted)
and it took
   user  system elapsed
 139.09   55.56  199.42
seconds.
The best strategy is usually to avoid such loops.
For example, the same result could have been obtained with:
> system.time(
+ roland <- outer(X=1:2500, Y=1:1500, FUN=function(a,b) a+b)
+ )
   user  system elapsed
   0.25    0.09    0.34
Quite a speed-up, I would say, no? In general, 'outer' and the apply
family (apply, tapply, lapply, sapply -- did I forget one?) can perform
miracles in terms of speed. They also let you express ideas in very
elegant ways, in my opinion.
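As a quick check that the loop and 'outer' really fill the matrix with the
same values, here is a sketch on a small matrix (the 5x4 size and the variable
names are chosen only for illustration). It also shows a third, loop-free
variant using the base R functions row() and col(), which is an alternative
and not something used in the timings above:

```r
n_row <- 5
n_col <- 4

# Explicit double loop, as in the timed example but on a small matrix.
by_loop <- matrix(0, nrow = n_row, ncol = n_col)
for (i in seq_len(n_row)) {
  for (j in seq_len(n_col)) {
    by_loop[i, j] <- i + j
  }
}

# outer() with the vectorized "+" function.
by_outer <- outer(seq_len(n_row), seq_len(n_col), FUN = "+")

# row() and col() return matrices of row and column indices.
by_index <- row(by_loop) + col(by_loop)

all(by_loop == by_outer)
all(by_loop == by_index)
```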
I have to admit, though, that it takes a while to grasp the various
concepts (and I am also still learning).
Maybe you could supply a small, working code example as the posting
guide suggests? This might give you more help for your specific needs.
Hope this helps,
Roland
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.