[R] Need a vectorized way to avoid two nested FOR loops

joris meys jorismeys at gmail.com
Thu Oct 8 14:34:10 CEST 2009


Neat piece of code, Jim, but it still uses a nested loop. If you order
the matrix first, you only need one pass through the whole matrix
to find the information you need.

Of course, I am not counting the cost of the ordering itself. If the
ordering algorithm doesn't run in linear time, then I guess the single
pass doesn't really matter: the ordering becomes the limiting step.
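
A minimal sketch of the order-then-scan idea, reusing the test data from
Jim's message quoted below. The names ord, keys, grp, members and x.match
are illustrative only and do not come from the thread. It assumes the key
columns contain no NA values and can be compared after as.matrix(). The
call to order() is typically the O(n log n) step; everything after it is
a single pass over the sorted rows.

```r
# Same test data as in the quoted message
n <- 20
set.seed(2)
x <- as.data.frame(matrix(sample(1:2, n * 6, TRUE), nrow = n))
x.col <- c(1, 3, 5)

# 1. Order the rows on the key columns so that identical keys are adjacent
ord  <- do.call(order, unname(as.list(x[, x.col])))
keys <- as.matrix(x[ord, x.col, drop = FALSE])

# 2. One pass over the sorted keys: a row starts a new group when it
#    differs from the previous sorted row in at least one key column
differs <- rowSums(keys[-1, , drop = FALSE] != keys[-n, , drop = FALSE]) > 0
grp <- cumsum(c(TRUE, differs))

# 3. Original row indices per group; each row's matches are the other
#    members of its own group
members <- split(ord, grp)
x.match <- vector("list", n)
x.match[ord] <- lapply(seq_len(n), function(k) {
  setdiff(members[[grp[k]]], ord[k])
})
```

x.match[[i]] then holds the indices of the other rows that agree with row
i on the columns in x.col, which should correspond to x.match2 in the
quoted code.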

Kind regards
Joris



On Thu, Oct 8, 2009 at 2:24 PM, jim holtman <jholtman at gmail.com> wrote:
> I answered the wrong question.  Here is the code to find all the
> matches for each row:
>
> n <- 20
> set.seed(2)
> # create test dataframe
> x <- as.data.frame(matrix(sample(1:2,n*6, TRUE), nrow=n))
> x
> x.col <- c(1,3,5)
>
> # match against all the other rows
> x.match1 <- apply(x[, x.col], 1, function(a){
>    .mat <- which(apply(x[, x.col], 1, function(z){
>        all(a == z)
>    }))
> })
>
> # remove matches to itself
> x.match2 <- lapply(seq(length(x.match1)), function(z){
>    x.match1[[z]][!(x.match1[[z]] %in% z)]
> })
> # x.match2 contains the indices of the matching rows
>
> On Wed, Oct 7, 2009 at 3:52 PM, Rama Ramakrishnan <rama at alum.mit.edu> wrote:
>>
>> Hi Friends,
>>
>> I have a data frame d. Let vars be the column indices for a subset of the
>> columns in d (e.g., vars <- c(1,3,4,8))
>>
>> For each row r in d, I want to collect all the other rows in d that match
>> the values in row r for just the columns in vars.
>>
>> The naive way to do this is to have a for loop stepping through each row in
>> d, and within the loop have another loop going through all the rows again,
>> checking for equality. This is quadratic in the number of rows and takes way
>> too long. Is there a better, "vectorized" way to do this?
>>
>> Thanks in advance!
>>
>> Rama Ramakrishnan
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>



