[R] R version 3.3.2, Windows 10: Applying a function to each possible pair of rows from two different data-frames

Rui Barradas ruipbarradas at sapo.pt
Fri Jun 23 18:02:38 CEST 2017


Hello,

Another way would be

# Number of row pairs; avoids building the grid and the error-prone 1:nrow(...)
n <- nrow(D1) * nrow(D2)
D5 <- data.frame(distance=numeric(n),difference=numeric(n))

D5[] <- do.call(rbind, lapply(seq_len(nrow(D1)), function(i)
    t(sapply(seq_len(nrow(D2)), function(j) {
        c(distance = sqrt(sum((D1[i, 1:2] - D2[j, 1:2])^2)),
          difference = (D1[i, 3] - D2[j, 3])^2)
    }))
))

identical(D3, D5)
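
A fully vectorized variant is also possible. What follows is only a sketch, not part of the code above: it rebuilds D1 and D2 as in the original post (the set.seed call and the names idx and D6 are assumptions added for illustration) and computes both columns from a grid of row indices instead of looping.

```r
# Self-contained sketch; D1 and D2 are rebuilt here as in the original post.
set.seed(1)  # assumption: fixed seed only so the example is repeatable
D1 <- data.frame(x = 1:5, y = 6:10, z = rnorm(5))
D2 <- data.frame(x = 19:30, y = 41:52, z = rnorm(12))

# All (i, j) index pairs. The first column of expand.grid varies fastest,
# so j runs through D2 for each i, matching the order of the nested loops.
idx <- expand.grid(j = seq_len(nrow(D2)), i = seq_len(nrow(D1)))

# Both columns are computed in one vectorized step over the index pairs.
D6 <- data.frame(
  distance   = sqrt((D1$x[idx$i] - D2$x[idx$j])^2 +
                    (D1$y[idx$i] - D2$y[idx$j])^2),
  difference = (D1$z[idx$i] - D2$z[idx$j])^2
)
```

Comparing with all.equal(D3, D6) rather than identical() is the safer check here, since the row names and the floating-point rounding may differ slightly from the loop versions.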


In my first answer I forgot to say that constructs like 1:nrow(...) or,
more generally, 1:m are error prone. If m == 0, then 1:0 is the vector
c(1, 0), so for(i in 1:0) is a perfectly legal loop that runs twice, with
i = 1 and i = 0, and neither index refers to an existing row.
The solution is to use ?seq_len or ?seq_along (same help page), like
this: for(i in seq_len(m)). In your case m is either nrow(D1) or nrow(D2).
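
The difference can be seen in a small sketch. The counters runs_colon and runs_seq are hypothetical names used only to show how often each loop body executes when m is zero.

```r
m <- 0

1:m          # the vector c(1, 0): two elements, counting down
seq_len(m)   # integer(0): an empty vector

# With 1:m the loop body runs twice, once for i = 1 and once for i = 0.
runs_colon <- 0
for (i in 1:m) runs_colon <- runs_colon + 1

# With seq_len(m) the loop body is skipped entirely.
runs_seq <- 0
for (i in seq_len(m)) runs_seq <- runs_seq + 1
```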

Hope this helps,

Rui Barradas



On 23-06-2017 16:35, Rui Barradas wrote:
> Hello,
>
> The obvious way would be to preallocate the resulting data.frame, since
> expanding an empty one on each iteration is a time-expensive operation.
>
> n <- nrow(expand.grid(1:nrow(D1), 1:nrow(D2)))
> D4 <- data.frame(distance=integer(n),difference=integer(n))
> k <- 0
> for (i in 1:nrow(D1)){
>      for (j in 1:nrow(D2))  {
>          k <- k + 1
>          D4[k, ] <- c(distance=sqrt(sum((D1[i,1:2]-D2[j,1:2])^2)),
>                       difference=(D1[i,3]-D2[j,3])^2)
>      }
> }
>
> identical(D3, D4)
>
> Hope this helps,
>
> Rui Barradas
>
> On 23-06-2017 16:19, Rathore, Saubhagya Singh wrote:
>> For some reason, the content was not visible in the last mail, so
>> I am posting it again.
>>
>> Dear Members,
>>
>> I have two different dataframes with a different number of rows. I
>> need to apply a set of functions to each possible combination of rows
>> with one row coming from 1st dataframe and other from 2nd dataframe.
>> Though I am able to perform this task using for loops, I feel that
>> there must be a more efficient way to do it. An example case is given
>> below. D1 and D2 are two dataframes. I need to evaluate D3 with one
>> column as the Euclidean distance in the x-y plane and second column as
>> squared difference of z values, of each row pair from D1 and D2.
>>
>> D1<-data.frame(x=1:5,y=6:10,z=rnorm(5))
>> D2<-data.frame(x=19:30,y=41:52,z=rnorm(12))
>> D3<-data.frame(distance=integer(0),difference=integer(0))
>>
>> for (i in 1:nrow(D1)){
>>
>> for (j in 1:nrow(D2))  {
>>
>> temp<-data.frame(distance=sqrt(sum((D1[i,1:2]-D2[j,1:2])^2)),difference=(D1[i,3]-D2[j,3])^2)
>>
>> D3<-rbind(D3,temp)
>> }
>> }
>>
>> Thank you
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
>> r-help-owner at r-project.org
>> Sent: Friday, June 23, 2017 10:47 AM
>> To: Rathore, Saubhagya Singh <saubhagya at gatech.edu>
>> Subject: R version 3.3.2, Windows 10: Applying a function to each
>> possible pair of rows from two different data-frames
>>
>> The message's content type was not explicitly allowed
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>