[R] Slow computation in for loop
Yves Brostaux
brostaux.y at fsagx.ac.be
Wed May 28 12:02:54 CEST 2003
First of all, thank you for your response.
I do have to refine my pseudocode. 'result' is a numeric vector of
length 7, and it is appended to the accumulated results with rbind():
for (k in replicates) {
  data <- sampling from a population
  for (i in param1) {
    for (j in param2) {
      result <- function(i, j, data)
      all.results <- rbind(all.results, result)
    }
  }
}
all.results is at most a data frame of 220 rows and 7 columns, which does
not seem big enough to explain such a slow computation.
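For reference, here is a minimal sketch of the same accumulation with a preallocated matrix instead of repeated rbind() calls. The names f and population, and all the parameter values, are hypothetical stand-ins that are not part of the original code, and nothing in this thread establishes that rbind() is the cause of the slowdown.

```r
# Hypothetical stand-ins for illustration only (not from the original post)
set.seed(1)
replicates <- 1:2
param1 <- 1:3
param2 <- 1:4
population <- rnorm(1000)
# Placeholder for the real function: returns a numeric vector of length 7
f <- function(i, j, data) {
  c(i, j, mean(data), sd(data), min(data), max(data), length(data))
}

# Allocate the full result matrix once, then fill it row by row
n.rows <- length(replicates) * length(param1) * length(param2)
all.results <- matrix(NA_real_, nrow = n.rows, ncol = 7)
row <- 0
for (k in replicates) {
  data <- sample(population, 100)   # sampling from a population
  for (i in param1) {
    for (j in param2) {
      row <- row + 1
      all.results[row, ] <- f(i, j, data)
    }
  }
}
```

Filling rows in place avoids copying the previously stored rows on each iteration; the matrix can be converted with as.data.frame() afterwards if a data frame is needed.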
Moreover, previous computations with a sample size of 100, each taking
about 4 seconds at most, did run in a little over 15 minutes for the
whole set.
The problem arises with a sample size of 500: the time for a single function
call increases as expected, but the time for the whole process does not!?
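The expected totals can be checked with a little arithmetic. This sketch uses only the figures quoted in this thread (220 calls, about 4 s and about 16 s per call); the variable names are made up for illustration.

```r
n.calls <- 220          # function calls generated by the three loops
secs.small <- 4         # seconds per call at sample size 100 (upper bound)
secs.hard <- 16         # seconds per call with the 'hardest' parameters

# Expected wall-clock time in minutes if calls simply add up
expected.small <- n.calls * secs.small / 60   # about 14.7 minutes
expected.hard <- n.calls * secs.hard / 60     # about 58.7 minutes

# Observed 8 to 10 hours, expressed as a multiple of the expected time
slowdown <- c(8, 10) * 60 / expected.hard
```

The first figure matches the observed 15 minutes or so; the second is the "expected 1 hour", against which 8 to 10 hours is roughly an eight- to tenfold slowdown.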
At 11:37 28/05/03, you wrote:
>Yves Brostaux <brostaux.y at fsagx.ac.be> writes:
>
> > Dear members,
> >
> > I'm using R to do some test computations on a set of parameters of a
> > function. This function is nested in three for() loops, the first one
> > for replications and the remaining two cycling through possible
> > parameter values, like this:
> >
> > for (k in replicates) {
> >   data <- sampling from a population
> >   for (i in param1) {
> >     for (j in param2) {
> >       result <- function(i, j, data)
> >     }
> >   }
> > }
> >
> > With the 'hardest' set of parameters, a single computation of the
> > function takes about 16 s on an old Sun Sparc workstation with 64 MB RAM
> > and never accesses the disk.
> >
> > But when I launch the for() loops (which generate 220 function calls),
> > the disk is used heavily and the whole process takes as much as 8
> > to 10 hours, instead of the expected 1 hour.
> >
> > What's wrong here? Is there something I don't know about for() loops,
> > and a way to correct it?
>
>The problem with pseudocode: You didn't really overwrite the "result"
>every time, did you? I bet you stored it somewhere.
>
>Two common causes of inefficiency are (a) that the stored objects may
>be large and (b) some naive ways of storing the results involve
>copying all preceding results, e.g.
>
>list.of.results <- list()
>for (.....){
> result <- ...
> list.of.results <- c(list.of.results, result)
>}
>
>The fix for (a) is to extract what you need and discard the rest
>and for (b) to allocate the list up front with the proper length and
>assign to list.of.results[[i]].
>
>--
> O__ ---- Peter Dalgaard Blegdamsvej 3
> c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
>~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
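The fix for (b) described in the quoted reply, allocating the list up front and assigning to list.of.results[[i]], might look like the following minimal sketch. The iteration count n and the function compute() are hypothetical placeholders; the elided loop header and computation in the quoted example are not known.

```r
n <- 5                           # hypothetical number of iterations
compute <- function(i) i^2       # hypothetical stand-in for the real work

# Allocate a list of the proper length once, before the loop
list.of.results <- vector("list", n)
for (i in seq_len(n)) {
  result <- compute(i)
  # Assign into an existing slot instead of growing the list with c()
  list.of.results[[i]] <- result
}
```

Because the list already has its final length, no iteration has to copy the results stored by the preceding ones.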