[R] Loops and dataframes
Sean Davis
sdavis2 at mail.nih.gov
Fri Feb 25 12:35:08 CET 2005
On Feb 25, 2005, at 6:06 AM, Firas Swidan wrote:
> Hi,
> I am experiencing a long delay when using dataframes inside loops and
> was wondering if this is a bug or not.
> Example code:
>
>> st <- rep(1,100000)
>> ed <- rep(2,100000)
>> for(i in 1:length(st)) st[i] <- ed[i] # works fine
>> df <- data.frame(start=st,end=ed)
>> for(i in 1:dim(df)[1]) df[i,1] <- df[i,2] # takes forever
>
> R: R 2.0.0 (2004-10-04)
> OS: Linux, Fedora Core 2
> kernel: 2.6.10-1.14_FC2
> cpu: AMD Athlon XP 1600.
> mem: 500MB.
>
> The example above is only to illustrate the problem. I need loops to
> apply some functions on pairs (not necessarily successive) of rows in
> a dataframe.
I'm not an expert, but working with a dataframe is typically slower than
working with the equivalent matrix. If it is possible (that is, if the
data are all of the same type, as they are above), working with the
equivalent matrix is probably faster. So, I think the general answer to
the implied question is that dataframe processing is slower than vector
processing or the equivalent matrix processing.
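
As a rough sketch of what that could look like for the example above (the
names m, s, e, and the df_* results are made up for illustration, and n is
reduced from the original 100000 so that everything runs quickly):

```r
# Same layout as the original example, with a smaller n (an assumption
# made here only to keep the run time short).
n <- 10000
st <- rep(1, n)
ed <- rep(2, n)
df <- data.frame(start = st, end = ed)

# Option 1: convert to a matrix, loop over the matrix, convert back.
# This only makes sense when all columns have the same type.
m <- as.matrix(df)
for (i in seq_len(nrow(m))) m[i, 1] <- m[i, 2]
df_from_matrix <- as.data.frame(m)

# Option 2: pull the columns out as plain vectors and loop over those.
s <- df$start
e <- df$end
for (i in seq_along(s)) s[i] <- e[i]
df_from_vectors <- data.frame(start = s, end = e)

# Option 3: for this particular example no loop is needed at all;
# a whole-column assignment does the same job.
df_vectorised <- df
df_vectorised$start <- df_vectorised$end
```

Wrapping each option (and the original dataframe loop) in system.time()
should show the difference on a given machine. Whether any of these fit the
real problem depends on what has to be done with the pairs of rows.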
If you post more details about your specific problem, folks may be able
to find creative ways of speeding things up, if speed remains a
concern.
Sean