[R] Slow computation in for loop

Yves Brostaux brostaux.y at fsagx.ac.be
Wed May 28 12:02:54 CEST 2003


First of all, thank you for your response.

I actually have to refine my pseudocode. 'result' is a numeric vector of 
length 7, and it is appended to the accumulated results with rbind():

for (k in replicates) {
   data <- sampling from a population
   for (i in param1) {
     for (j in param2) {
        result <- function(i, j, data)
        all.results <- rbind(all.results, result)
     }
   }
}

all.results is at most a data frame of 220 rows and 7 columns, which doesn't 
seem big enough to explain such a slow computation.
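
For reference, the rbind() accumulation in the loop above can be replaced by 
a matrix allocated once and filled row by row. This is only a minimal sketch: 
compute(), the parameter vectors and the rnorm() sampling step are 
hypothetical stand-ins, since the real function and population are not shown 
in this thread.

```r
# Hypothetical stand-ins; the real sampling step and function are not shown.
set.seed(1)
replicates <- 1:2
param1 <- 1:3
param2 <- 1:4
compute <- function(i, j, data) {
  # Returns a numeric vector of length 7, like 'result' in the pseudocode.
  c(i, j, mean(data), sd(data), min(data), max(data), length(data))
}

# Allocate the full result matrix once instead of growing it with rbind().
n.rows <- length(replicates) * length(param1) * length(param2)
all.results <- matrix(NA_real_, nrow = n.rows, ncol = 7)

row <- 0
for (k in replicates) {
  data <- rnorm(500)  # placeholder for "sampling from a population"
  for (i in param1) {
    for (j in param2) {
      row <- row + 1
      all.results[row, ] <- compute(i, j, data)  # fill one row in place
    }
  }
}

# Convert to a data frame once, at the end, if one is needed.
all.results <- as.data.frame(all.results)
```

With only 220 rows the copying done by rbind() is probably small, so this 
may not be the main cause of the slowdown, but it removes one suspect.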

Moreover, previous computations with a sample size of 100, where a single 
call took about 4 seconds at most, did indeed run in a little over 
15 minutes for the whole set, as expected.

The problem arises with a sample size of 500: the computation time of a 
single function call increases as expected, but the whole process slows 
down far more than that would explain!?


At 11:37 28/05/03, you wrote:
>Yves Brostaux <brostaux.y at fsagx.ac.be> writes:
>
> > Dear members,
> >
> > I'm using R to do some test computation on a set of parameters of a
> > function. This function is included in three for() loops, first one
> > for replications, and the remaining two cycling through possible
> > parameters values, like this :
> >
> > for (k in replicates) {
> >    data <- sampling from a population
> >    for (i in param1) {
> >      for (j in param2) {
> >         result <- function(i, j, data)
> >      }
> >    }
> > }
> >
> > With the 'hardest' set of parameters, a single computation of the
> > function takes about 16 s on an old Sun Sparc workstation with 64 MB
> > of RAM and doesn't access the disk a single time.
> >
> > But when I launch the for() loops (which generate 220 function calls),
> > the disk is used very heavily and the whole process takes as much as 8
> > to 10 hours, instead of the expected 1 hour.
> >
> > What's wrong here ? Is there a thing I don't know about for() loops,
> > and a way to correct it ?
>
>The problem with pseudocode: You didn't really overwrite the "result"
>every time did you? I bet you stored it somewhere.
>
>Two common causes of inefficiency are (a) that the stored objects may
>be large and (b) some naive ways of storing the results involve
>copying all preceding results, e.g.
>
>list.of.results <- list()
>for (.....){
>      result <- ...
>      list.of.results <- c(list.of.results, result)
>}
>
>The fix for (a) is to extract what you need and discard the rest
>and for (b) to allocate the list up front with the proper length and
>assign to list.of.results[[i]].
>
>--
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
>~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
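
The two fixes suggested in the quoted reply can be sketched as follows. This 
is an illustration under assumptions, not code from the thread: the lm() fit 
stands in for any computation that returns a large object, and n, x, y and 
fit are hypothetical names.

```r
set.seed(1)
n <- 10

# (b) Allocate the list up front with the proper length.
list.of.results <- vector("list", n)

for (i in seq_len(n)) {
  x <- rnorm(50)
  y <- 2 * x + rnorm(50)
  fit <- lm(y ~ x)  # stand-in for a computation returning a large object

  # (a) Extract only what is needed (here the two coefficients) and
  # assign it in place instead of growing the list with c().
  list.of.results[[i]] <- coef(fit)
}

# Bind everything together once, after the loop.
results <- do.call(rbind, list.of.results)
```

Binding once after the loop means the stored results are copied a single 
time rather than at every iteration.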



