[R] Need a vectorized way to avoid two nested FOR loops

Rama Ramakrishnan rama at alum.mit.edu
Wed Oct 7 21:52:21 CEST 2009


Hi Friends,

I have a data frame d. Let vars be the column indices for a subset of
the columns in d (e.g., vars <- c(1,3,4,8)).

For each row r in d, I want to collect all the other rows in d that  
match the values in row r for just the columns in vars.

The naive way to do this is to have a for loop stepping through each  
row in d, and within the loop have another loop going through all the  
rows again, checking for equality. This is quadratic in the number of  
rows and takes way too long. Is there a better, "vectorized" way to do  
this?
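
For concreteness, here is a rough sketch of the naive approach described
above, next to one possible grouping-based alternative. The example data
frame and the function names naive_matches and grouped_matches are made up
for illustration, and the sketch assumes there are no NA values in the vars
columns. The grouped version builds one pasted string key per row, so it
also assumes the values survive conversion to character without collisions
(which may not hold for floating-point columns).

```r
# Hypothetical example data; only the columns indexed by vars are compared.
d <- data.frame(a = c(1, 1, 2, 1),
                b = c("x", "y", "x", "x"),
                c = c(TRUE, TRUE, FALSE, TRUE),
                stringsAsFactors = FALSE)
vars <- c(1, 3)

# Naive version: for each row r, scan every other row s and keep s when
# all vars columns agree. Quadratic in nrow(d); assumes no NA values.
naive_matches <- function(d, vars) {
  n <- nrow(d)
  out <- vector("list", n)
  for (r in seq_len(n)) {
    hits <- integer(0)
    for (s in seq_len(n)) {
      if (s != r && all(d[s, vars] == d[r, vars])) hits <- c(hits, s)
    }
    out[[r]] <- hits
  }
  out
}

# Grouping version: build one string key per row from the vars columns,
# split the row indices by key, then drop each row from its own group.
grouped_matches <- function(d, vars) {
  cols <- unname(as.list(d[vars]))           # unname so no column name clashes with sep
  key <- do.call(paste, c(cols, sep = "\r")) # "\r" is an assumed-unused separator
  groups <- split(seq_len(nrow(d)), key)
  lapply(seq_len(nrow(d)), function(r) setdiff(groups[[key[r]]], r))
}

res_naive <- naive_matches(d, vars)
res_grouped <- grouped_matches(d, vars)
```

Both functions return a list with one element per row of d, holding the
indices of the other rows that match on the vars columns. The grouped
version makes a single pass to build the keys instead of comparing every
pair of rows, though the total size of the result can still be large when
many rows share the same values.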

Thanks in advance!

Rama Ramakrishnan



