[R] Loops and dataframes

Sean Davis sdavis2 at mail.nih.gov
Fri Feb 25 12:35:08 CET 2005


On Feb 25, 2005, at 6:06 AM, Firas Swidan wrote:

> Hi,
> I am experiencing a long delay when using dataframes inside loops and was
> wondering if this is a bug or not.
> Example code:
>
>> st <- rep(1,100000)
>> ed <- rep(2,100000)
>> for(i in 1:length(st)) st[i] <- ed[i] # works fine
>> df <- data.frame(start=st,end=ed)
>> for(i in 1:dim(df)[1]) df[i,1] <- df[i,2] #takes for ever
>
> R: R 2.0.0 (2004-10-04)
> OS: Linux, Fedora Core 2
> kernel: 2.6.10-1.14_FC2
> cpu: AMD Athlon XP 1600.
> mem: 500MB.
>
> The example above is only to illustrate the problem. I need loops to apply
> some functions on pairs (not necessarily successive) of rows in a
> dataframe.

I'm not an expert, but working with dataframes is typically slower than 
working with the equivalent matrix.  If it is possible (that is, if the 
data are all of the same type, as they are above), converting to a matrix 
and looping over that is probably faster.  So, I think the general answer 
to the implied question is that this is not a bug: element-by-element 
dataframe processing is simply slower than vector processing or the 
equivalent matrix processing.
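
As a rough illustration of the point above, here is a minimal sketch that 
times the same loop on a dataframe and on the equivalent matrix.  The row 
count n and the object names (df_loop, m, df_vec) are my own choices for 
the example, not anything from your code, and I have used far fewer rows 
than your 100000 so that the dataframe loop finishes quickly.  The timings 
will vary by machine and R version, so treat them as indicative only.

```r
# Assumed size for illustration; the original example used 100000 rows.
n <- 2000
df <- data.frame(start = rep(1, n), end = rep(2, n))

# 1. Element-by-element assignment on the dataframe (the slow pattern).
df_loop <- df
t_df <- system.time(
  for (i in seq_len(nrow(df_loop))) df_loop[i, 1] <- df_loop[i, 2]
)

# 2. The same loop on the equivalent numeric matrix.
#    This only makes sense when all columns share one type.
m <- as.matrix(df)
t_m <- system.time(
  for (i in seq_len(nrow(m))) m[i, 1] <- m[i, 2]
)

# 3. For this particular toy case no loop is needed at all.
df_vec <- df
df_vec$start <- df_vec$end

# Elapsed seconds for the two loops.
print(c(dataframe = t_df[["elapsed"]], matrix = t_m[["elapsed"]]))
```

If the real task needs the result as a dataframe again, as.data.frame(m) 
converts the matrix back once the loop is done.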

If you post more details about your specific problem, folks may be able 
to find creative ways of speeding things up, if speed remains a 
concern.

Sean



