# [R] How to speed up multiple for loop over list of data frames

Patrick Burns pburns at pburns.seanet.com
Wed Oct 17 16:57:00 CEST 2007

```I suspect the vast majority of the time goes to growing
objects: each c(iv, ...) call copies the whole vector.

Preallocate 'iv', 'jv', 'rho_sv' and 'rho_pv' at their
final length and then assign the values into them by
subscript.
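
# A minimal sketch of the preallocation described above, not the original
# poster's code. It assumes 'alist' is a list of M data frames, each with
# N rows and a numeric column "v". The toy sizes, the random data and the
# counter 'k' are made up for illustration.
set.seed(1)
N <- 5
M <- 8
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

npairs <- N * (N - 1) / 2      # number of (i, j) pairs with i < j
iv     <- integer(npairs)      # allocate once, at final length
jv     <- integer(npairs)
rho_sv <- numeric(npairs)
rho_pv <- numeric(npairs)
v_i    <- numeric(M)
v_j    <- numeric(M)

k <- 0                         # position of the current pair in the results
for (i in 1:(N - 1)) {
  for (j in (i + 1):N) {
    for (p in 1:M) {
      v_i[p] <- alist[[p]][i, "v"]
      v_j[p] <- alist[[p]][j, "v"]
    }
    k <- k + 1
    iv[k]     <- i             # i < j always holds, so min/max are not needed
    jv[k]     <- j
    rho_sv[k] <- cor(v_i, v_j, method = "spearman")
    rho_pv[k] <- cor(v_i, v_j, method = "pearson")
  }
}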

Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

jim holtman wrote:

>First thing to do is to use Rprof (?Rprof) on a subset of your data to
>see where time is being spent.  My guess is that most of it is in the
>calls to 'cor' and if this is the case, then you have to figure out
>some other algorithm.
>
>Also if these dataframes all contain numeric information, convert them
>to matrices initially because the subsetting that you are doing on the
>dataframe (e.g., alist[[p]][i,"v"]) can be very expensive.  The output
>from Rprof will help determine what course of action you should take.
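
# A hedged sketch of the matrix suggestion above, going one step further
# than the advice itself: collect column "v" into one M x N matrix, then
# let a single cor() call produce every pairwise correlation. This is an
# alternative layout, not something proposed in the thread. It assumes all
# data frames have N rows and a numeric column "v"; the toy data and the
# names 'vmat', 'idx' and 'res' are made up for illustration.
set.seed(1)
N <- 5
M <- 8
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

# Row p holds data frame p, column i holds what the loop calls v_i.
vmat  <- t(sapply(alist, function(d) d[["v"]]))
rho_p <- cor(vmat, method = "pearson")    # N x N matrix, entry [i, j]
rho_s <- cor(vmat, method = "spearman")

# Keep only the pairs with i < j. Note that the pairs come out in
# column-major order, which differs from the order of the nested loops.
idx <- which(upper.tri(rho_p), arr.ind = TRUE)
res <- data.frame(iv     = idx[, "row"],
                  jv     = idx[, "col"],
                  rho_sv = rho_s[idx],
                  rho_pv = rho_p[idx])
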
>
>On 10/16/07, Dieter Best <dieterbest_2000 at yahoo.com> wrote:
>
>
>>Hi there,
>>
>> I have a multiple for loop over a list of data frames
>>
>> for ( i in 1:(N-1) ) {
>>   for ( j in (i+1):N ) {
>>       for ( p in 1:M ) {
>>           v_i[p]    = alist[[p]][i,"v"]
>>           v_j[p]    = alist[[p]][j,"v"]
>>       }
>>       rho_s = cor(v_i, v_j, method = "spearman")
>>       rho_p = cor(v_i, v_j, method = "pearson" )
>>       iv     = c( iv, min(i, j) )
>>       jv     = c( jv, max(i, j) )
>>       rho_sv = c( rho_sv, rho_s)
>>       rho_pv = c( rho_pv, rho_p)
>>   }
>>}
>>
>> N is of the order of 400, M about 800.
>>
>> This takes me an entire day basically. Is there anything I could do to speed things up or is cor really that slow?
>>
>> -- D
>>
>>______________________________________________
>>R-help at r-project.org mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help