[R] create list of names where two df contain == values

Dennis Murphy djmuser at gmail.com
Wed Nov 16 16:03:21 CET 2011


Hi:

I think you're overthinking this problem. As is usually the case in R,
a vectorized solution is clearer and provides more easily understood
code.

It's not obvious to me exactly what you want, so we'll try a couple of
variations on the same idea. Testing floating-point numbers for exact
equality is unreliable (see R FAQ 7.31), but if it makes sense to
define a threshold below which a difference is practically zero, then
you're in business. In your example, the difference in numb1 for letter
h between the two data frames is far from zero, so define 'equal' to
mean an absolute difference < 10^(-6). The comparison is done row by
row, so it assumes the rows of df.1 and df.2 are in the same order.
Then:

# Return the entire matching data frame
df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001, ]
   Letters     numb1 extra.col    id
1        a 0.3735462         1 CG234
2        b 1.1836433         2 CG232
3        c 0.1643714         3 CG441
4        d 2.5952808         4 CG128
5        e 1.3295078         5 CG125
6        f 0.1795316         6 CG182
7        g 1.4874291         7 CG982
9        i 1.5757814         9 CG282
10       j 0.6946116        10 CG154
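
To see why a tolerance is used instead of ==, here is a small sketch.
The values and the helper name near_equal are made up for illustration;
only the 1e-6 threshold comes from the text above.

```r
# Exact comparison of doubles can fail even when the values "should" match
x <- 0.1 + 0.2
y <- 0.3
x == y                      # FALSE on typical IEEE 754 hardware

# Hypothetical helper (not part of base R): treat two values as equal
# when their absolute difference is below a chosen tolerance
near_equal <- function(a, b, tol = 1e-6) abs(a - b) < tol

near_equal(x, y)                        # TRUE
near_equal(c(1, 2, 3), c(1, 2.5, 3))    # TRUE FALSE TRUE, element by element
```

Because abs() and < are vectorized, the same expression works on whole
columns, which is what the data frame subsetting relies on.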

# Return the matching letters only as a vector:
df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001, 'Letters' ]

If you want the latter object to remain a data frame, use drop = FALSE
as an extra argument after 'Letters'. If you want to create a list
object such that each letter comprises a different list component,
then the following will do - the as.character() part coerces the
factor Letters into a character object:

as.list(as.character(df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001,
             'Letters' ]))
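
For reference, the three return shapes can be compared side by side in
a self-contained sketch. The data below are a trimmed-down version of
the example in the original post (only the Letters and numb1 columns),
and the object names keep, as_vector, as_df and as_lst are invented
here. stringsAsFactors = FALSE is set explicitly, so as.character() is
harmless rather than required in this version.

```r
# Rebuild a reduced version of the example data from the original post
set.seed(1)
vals <- rnorm(10, 1, 1)
df.1 <- data.frame(Letters = letters[1:10], numb1 = vals,
                   stringsAsFactors = FALSE)
df.2 <- df.1
df.2$numb1[8] <- 12          # row "h" now differs between the two

# Row-by-row comparison; assumes both data frames share the same row order
keep <- abs(df.1$numb1 - df.2$numb1) < 1e-6

as_vector <- df.1[keep, "Letters"]                 # plain vector
as_df     <- df.1[keep, "Letters", drop = FALSE]   # one-column data frame
as_lst    <- as.list(as.character(as_vector))      # list, one letter per component

length(as_lst)               # 9 (every letter except "h")
```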

HTH,
Dennis


On Wed, Nov 16, 2011 at 5:03 AM, Rob Griffin <robgriffin247 at hotmail.com> wrote:
> Hello again... sorry to be posting yet again, but I hadn't anticipated this
> problem.
>
> I am now trying to put the names found in one column of data frame 1 (let's
> call it df.1[,1]) into a list, taken from the rows where the values in
> df.1[,2] match values in a column of another data frame (df.2[3]).
> I tried to write this function so that it put the list of names (called
> Iffy) where the 2 criteria (df.1[141] and df.2[21]) matched, but I think it's
> too complex for a beginner R-enthusiast.
>
> ify<-function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}
> Iffy<-apply(  df.1,  1,  FUN=ify,  x=df.1,  y=df.2,  a=2,  b=3,  c=1  )
>
> But this didn't work... Error in FUN(newX[, i], ...) : unused argument(s)
> (newX[, i])
>
>
> Here is a dataset that replicates the problem, you'll notice the "h"
> criteria values are different between the two dataframes and therefore it
> would produce a list  of the 9 letters where the two criteria columns
> matched (a,b,c,d,e,f,g,i,j):
>
>
>
> df.1<-data.frame(rep(letters[1:10]))
> colnames(df.1)[1]<-("Letters")
> set.seed(1)
> df.1$numb1<-rnorm(10,1,1)
> df.1$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
> df.1$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
> df.1
>
> df.2<-data.frame(rep(letters[1:10]))
> colnames(df.2)[1]<-("Letters")
> set.seed(1)
> df.2$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
> df.2$numb1<-rnorm(10,1,1)
> df.2$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
> df.2[8,3]<-12
>
> df.1
> df.2
>
>
>
>
> Your patience is much appreciated,
> Rob
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


