[R] create list of names where two df contain == values
Dennis Murphy
djmuser at gmail.com
Wed Nov 16 16:03:21 CET 2011
Hi:
I think you're overthinking this problem. As is usually the case in R,
a vectorized solution is clearer and provides more easily understood
code.
It's not obvious to me exactly what you want, so we'll try a couple of
variations on the same idea. Testing floating-point numbers for exact
equality is unreliable (see R FAQ 7.31), but if it makes sense to
define a threshold below which the absolute difference between two
numbers is treated as zero, then you're in business. In your example,
the difference in numb1 for letter h between the two data frames is far
from zero, while the other rows agree, so define 'equal' to be an
absolute difference < 10^{-6}. Then:
# Return the entire matching data frame
df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001, ]
   Letters     numb1 extra.col    id
1        a 0.3735462         1 CG234
2        b 1.1836433         2 CG232
3        c 0.1643714         3 CG441
4        d 2.5952808         4 CG128
5        e 1.3295078         5 CG125
6        f 0.1795316         6 CG182
7        g 1.4874291         7 CG982
9        i 1.5757814         9 CG282
10       j 0.6946116        10 CG154
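
As an aside on why a threshold is used instead of ==, here is a minimal
sketch of the issue described in R FAQ 7.31. The values 0.1, 0.2 and 0.3
and the name tol are chosen purely for illustration and are not part of
the example data:

```r
# Illustrative values only; 'tol' is a name assumed for this sketch.
x <- 0.1 + 0.2
y <- 0.3
tol <- 1e-6

# Exact comparison of two doubles: sensitive to rounding in the last bits
exact_equal <- x == y

# Threshold comparison: the same idea as the abs(...) < 0.000001 test above
within_tol <- abs(x - y) < tol

# Base R's own tolerance-based comparison, wrapped in isTRUE() because
# all.equal() returns a character description when the values differ
all_equal <- isTRUE(all.equal(x, y))

print(c(exact = exact_equal, tolerance = within_tol, all.equal = all_equal))
```

The exact comparison fails here even though the two values print the
same, whereas both tolerance-based tests succeed. Which threshold is
sensible depends on the scale of your data.
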
# Return the matching letters only as a vector:
df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001, 'Letters' ]
If you want the latter object to remain a data frame, use drop = FALSE
as an extra argument after 'Letters'. If you want to create a list
object such that each letter comprises a different list component,
then the following will do - the as.character() part coerces the
factor Letters into a character object:
as.list(as.character(df.1[abs(df.1$numb1 - df.2$numb1) < 0.000001,
'Letters' ]))
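
For convenience, the variants above can be collected into one
self-contained script. This is only a sketch: the data frames are a
compact reconstruction of the ones in the original post, and the names
vals, ids, tol, keep and the result objects are assumptions made for
illustration. Note that the comparison is positional, i.e. it assumes
the rows of df.1 and df.2 are in the same order. The merge() step at the
end is an alternative not used in the reply above, shown in case the
rows need to be paired by Letters instead:

```r
# Compact reconstruction of the example data from the original post
set.seed(1)
vals <- rnorm(10, 1, 1)
ids <- c("CG234", "CG232", "CG441", "CG128", "CG125",
         "CG182", "CG982", "CG541", "CG282", "CG154")
df.1 <- data.frame(Letters = letters[1:10], numb1 = vals,
                   extra.col = 1:10, id = ids)
df.2 <- data.frame(Letters = letters[1:10], extra.col = 1:10,
                   numb1 = vals, id = ids)
df.2[8, 3] <- 12          # column 3 of df.2 is numb1; row 8 is letter h

# Logical index of rows whose numb1 values agree within the threshold
tol <- 1e-6
keep <- abs(df.1$numb1 - df.2$numb1) < tol

# Matching letters as a character vector (as.character() is harmless
# whether Letters is stored as a factor or as character)
letters_vec <- as.character(df.1[keep, "Letters"])

# Same selection, kept as a one-column data frame
letters_df <- df.1[keep, "Letters", drop = FALSE]

# One list component per matching letter
letters_list <- as.list(letters_vec)

# Alternative (an assumption, not from the reply): pair rows by Letters
# first, so the result does not depend on row order. merge() appends the
# suffixes to the non-key columns that share a name.
m <- merge(df.1, df.2, by = "Letters", suffixes = c(".1", ".2"))
merged_letters <- as.character(m$Letters[abs(m$numb1.1 - m$numb1.2) < tol])

print(letters_vec)
```
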
HTH,
Dennis
On Wed, Nov 16, 2011 at 5:03 AM, Rob Griffin <robgriffin247 at hotmail.com> wrote:
> Hello again... sorry to be posting yet again, but I hadn't anticipated this
> problem.
>
> I am now trying to put the names found in one column of data frame 1 (let's
> call it df.1[,1]) into a list, taking the rows where the values in df.1[,2]
> match values in a column of another data frame (df.2[3]).
> I tried to write this function so that it put the list of names (called
> Iffy) where the 2 criteria (df.1[141] and df.2[21]) matched but I think it's
> too complex for a beginner R-enthusiast
>
> ify<-function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}
> Iffy<-apply( df.1, 1, FUN=ify, x=df.1, y=df.2, a=2, b=3, c=1 )
>
> But this didn't work... Error in FUN(newX[, i], ...) : unused argument(s)
> (newX[, i])
>
>
> Here is a dataset that replicates the problem, you'll notice the "h"
> criteria values are different between the two dataframes and therefore it
> would produce a list of the 9 letters where the two criteria columns
> matched (a,b,c,d,e,f,g,i,j):
>
>
>
> df.1<-data.frame(rep(letters[1:10]))
> colnames(df.1)[1]<-("Letters")
> set.seed(1)
> df.1$numb1<-rnorm(10,1,1)
> df.1$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
> df.1$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
> df.1
>
> df.2<-data.frame(rep(letters[1:10]))
> colnames(df.2)[1]<-("Letters")
> set.seed(1)
> df.2$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
> df.2$numb1<-rnorm(10,1,1)
> df.2$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
> df.2[8,3]<-12
>
> df.1
> df.2
>
>
>
>
> Your patience is much appreciated,
> Rob
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>