[R] R version 3.3.2, Windows 10: Applying a function to each possible pair of rows from two different data-frames

Rathore, Saubhagya Singh saubhagya at gatech.edu
Fri Jun 23 21:53:32 CEST 2017


Thank you very much, Mr. Gunter, for making me realize the power of vectorization. I need to work a lot more to exploit this strength of R. I applied your suggested method to my problem, where D1 and D2 have 600 observations each. The time was significantly reduced compared to the fastest working code I had (user: 61, system: 0.01, elapsed: 0.62).
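For anyone who wants to try a similar measurement, here is a minimal sketch of how the vectorized approach could be timed with system.time(). It is an illustration only: the random 600-row frames, the seed, and the name `timing` are assumptions and not the data or code from my actual problem, so the numbers it prints will differ from those quoted above.

```r
# Hypothetical stand-in data: two frames with 600 rows each
set.seed(1)
n <- 600
D1 <- data.frame(x = runif(n), y = runif(n), z = rnorm(n))
D2 <- data.frame(x = runif(n), y = runif(n), z = rnorm(n))

timing <- system.time({
  # by = NULL asks merge() for the Cartesian product (600 * 600 rows);
  # shared column names get the default suffixes .x (D1) and .y (D2)
  D.all <- merge(D1, D2, by = NULL)
  D.all$distance   <- sqrt((D.all$x.x - D.all$x.y)^2 + (D.all$y.x - D.all$y.y)^2)
  D.all$difference <- (D.all$z.x - D.all$z.y)^2
})

print(timing)  # user, system and elapsed times in seconds
```

The cost of this approach is memory: the merged frame holds one row for every pair.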

Thank you again for your generous help. 
Saubhagya
  
-----Original Message-----
From: Bert Gunter [mailto:bgunter.4567 at gmail.com] 
Sent: Friday, June 23, 2017 3:20 PM
To: Rathore, Saubhagya Singh <saubhagya at gatech.edu>
Cc: r-help at r-project.org
Subject: Re: [R] R version 3.3.2, Windows 10: Applying a function to each possible pair of rows from two different data-frames

You appear to be trying to write C code in R. Don't do this. If you can trade off space for efficiency, the calculation can be easily vectorized (assuming I correctly understand what you want to do, of course).

set.seed(135) ## for reproducibility
D1<-data.frame(x=1:5,y=6:10,z=rnorm(5))
D2<-data.frame(x=19:30,y=41:52,z=rnorm(12))

D.all <-merge(D1,D2, by.x=NULL,by.y=NULL) ## Cartesian product of the two frames

D.all$distance <- sqrt(rowSums((D.all[,1:2] - D.all[,4:5])^2)) ## note use of rowSums
D.all$difference <- (D.all[,3] - D.all[,6])^2

D.all
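As a possible alternative, the same pairwise quantities can probably also be computed with outer(), which avoids building the merged data frame. This is only a sketch under the same example data; the names dx, dy, dz, dist_mat, diff_mat and pairs are made up for illustration.

```r
set.seed(135) ## for reproducibility
D1 <- data.frame(x = 1:5, y = 6:10, z = rnorm(5))
D2 <- data.frame(x = 19:30, y = 41:52, z = rnorm(12))

## outer(a, b, "-") gives a length(a) x length(b) matrix with entries a[i] - b[j]
dx <- outer(D1$x, D2$x, "-")
dy <- outer(D1$y, D2$y, "-")
dz <- outer(D1$z, D2$z, "-")

dist_mat <- sqrt(dx^2 + dy^2) ## 5 x 12 Euclidean distances in the x-y plane
diff_mat <- dz^2              ## 5 x 12 squared differences of z

## Flatten to one row per pair; as.vector() is column-major,
## so the D1 row index i varies fastest
pairs <- data.frame(
  i = as.vector(row(dist_mat)),
  j = as.vector(col(dist_mat)),
  distance = as.vector(dist_mat),
  difference = as.vector(diff_mat)
)

head(pairs)
```

Keeping the results as matrices (dist_mat[i, j]) may be more convenient than the long data frame if you need to look up a particular pair.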



Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Jun 23, 2017 at 8:19 AM, Rathore, Saubhagya Singh <saubhagya at gatech.edu> wrote:
> For some reason, the content was not visible in the last mail, so I am posting it again.
>
> Dear Members,
>
> I have two dataframes with different numbers of rows. I need to apply a set of functions to each possible combination of rows, with one row coming from the 1st dataframe and the other from the 2nd dataframe. Though I am able to perform this task using for loops, I feel that there must be a more efficient way to do it. An example case is given below. D1 and D2 are the two dataframes. I need to evaluate D3, with one column as the Euclidean distance in the x-y plane and the second column as the squared difference of z values, for each row pair from D1 and D2.
>
> D1<-data.frame(x=1:5,y=6:10,z=rnorm(5))
> D2<-data.frame(x=19:30,y=41:52,z=rnorm(12))
> D3<-data.frame(distance=integer(0),difference=integer(0))
>
> for (i in 1:nrow(D1)){
>
> for (j in 1:nrow(D2))  {
>
> temp<-data.frame(distance=sqrt(sum((D1[i,1:2]-D2[j,1:2])^2)),difference=(D1[i,3]-D2[j,3])^2)
> D3<-rbind(D3,temp)
> }
> }
>
> Thank you
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of 
> r-help-owner at r-project.org
> Sent: Friday, June 23, 2017 10:47 AM
> To: Rathore, Saubhagya Singh <saubhagya at gatech.edu>
> Subject: R version 3.3.2, Windows 10: Applying a function to each 
> possible pair of rows from two different data-frames
>
> The message's content type was not explicitly allowed
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

