jim holtman
jholtman at gmail.com
Wed Oct 17 15:44:08 CEST 2007
The first thing to do is to run Rprof (see ?Rprof) on a subset of your
data to see where the time is being spent. My guess is that most of it
is in the calls to 'cor'; if that is the case, then you will have to
find some other algorithm.
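
A minimal sketch of such a profiling run, assuming a small made-up
data set (the sizes N and M, the simulated 'alist' and the helper name
'slow_pairs' are only for illustration; the loop follows the one quoted
below):

```r
# Hypothetical test data: M data frames, each with N rows and a numeric column "v"
set.seed(1)
N <- 40
M <- 60
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

# The loop from the original post, wrapped in a function (name is illustrative)
slow_pairs <- function(alist, N, M) {
  v_i <- numeric(M)
  v_j <- numeric(M)
  rho_sv <- c()
  rho_pv <- c()
  for (i in 1:(N - 1)) {
    for (j in (i + 1):N) {
      for (p in 1:M) {
        v_i[p] <- alist[[p]][i, "v"]
        v_j[p] <- alist[[p]][j, "v"]
      }
      rho_sv <- c(rho_sv, cor(v_i, v_j, method = "spearman"))
      rho_pv <- c(rho_pv, cor(v_i, v_j, method = "pearson"))
    }
  }
  list(rho_s = rho_sv, rho_p = rho_pv)
}

prof_file <- tempfile()
Rprof(prof_file)               # start collecting timing samples
res <- slow_pairs(alist, N, M)
Rprof(NULL)                    # stop profiling

# Functions ranked by the time spent in their own code
print(head(summaryRprof(prof_file)$by.self))
```

The 'by.self' table should show whether the time goes into 'cor' or
into the data frame indexing.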
Also, if these data frames all contain numeric information, convert them
to matrices initially, because the subsetting that you are doing on the
data frame (e.g., alist[[p]][i,"v"]) can be very expensive. The output
from Rprof will help determine what course of action you should take.
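
One possible way to do the conversion, as a sketch only: it assumes
every data frame in 'alist' has the same N rows and a numeric column
"v", and the names 'V', 'idx' and 'result' are made up here. Once the
values are in one matrix, 'cor' can compute all pairs in a single call,
since it correlates the columns of a matrix with each other:

```r
# Hypothetical test data, same layout as in the original post
set.seed(42)
N <- 6
M <- 20
alist <- lapply(seq_len(M), function(p) data.frame(v = rnorm(N)))

# Pull column "v" out of each data frame once.
# sapply() gives an N x M matrix; transpose so that column i holds v_i.
V <- t(sapply(alist, function(d) d[["v"]]))

# One call per method returns the full N x N correlation matrix
rho_p_mat <- cor(V, method = "pearson")
rho_s_mat <- cor(V, method = "spearman")

# Take each pair once. The lower triangle in column-major order gives
# (2,1), (3,1), ..., (N,1), (3,2), ... so "col" is the smaller index i and
# "row" the larger index j, in the same order as the nested loops.
idx <- which(lower.tri(rho_p_mat), arr.ind = TRUE)
result <- data.frame(i     = idx[, "col"],
                     j     = idx[, "row"],
                     rho_s = rho_s_mat[idx],
                     rho_p = rho_p_mat[idx])
print(head(result))
```

This also avoids growing iv, jv, rho_sv and rho_pv with c() on every
iteration, which copies the whole vector each time.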
On 10/16/07, Dieter Best <dieterbest_2000 at yahoo.com> wrote:
> Hi there,
>
> I have a multiple for loop over a list of data frames
>
> for ( i in 1:(N-1) ) {
>   for ( j in (i+1):N ) {
>     for ( p in 1:M ) {
>       v_i[p] = alist[[p]][i,"v"]
>       v_j[p] = alist[[p]][j,"v"]
>     }
>     rho_s = cor(v_i, v_j, method = "spearman")
>     rho_p = cor(v_i, v_j, method = "pearson" )
>     iv = c( iv, min(i, j) )
>     jv = c( jv, max(i, j) )
>     rho_sv = c( rho_sv, rho_s)
>     rho_pv = c( rho_pv, rho_p)
>   }
> }
>
> N is of the order of 400, M about 800.
>
> This takes me an entire day basically. Is there anything I could do to speed things up or is cor really that slow?
>
> -- D
