[R] create list of names where two df contain == values
David Winsemius
dwinsemius at comcast.net
Wed Nov 16 15:04:22 CET 2011
On Nov 16, 2011, at 8:03 AM, Rob Griffin wrote:
> Hello again... sorry to be posting yet again, but I hadn't
> anticipated this problem.
>
> I am trying to now put the names found in one column in data frame 1
> (lets call it df.1[,1]) in to a list from the rows where the values
> in df.1[,2] match values in a column of another dataframe (df.2[3])
> I tried to write this function so that it put the list of names
> (called Iffy) where the 2 criteria (df.1[141] and df.2[21]) matched
> but I think its too complex for a beginner R-enthusiast
>
> ify<-function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else
> {NULL}
When you are building a helper function for use with apply, you
should realize that the function will be getting a vector, not a list.
The construction "[[,a]]" looks pretty strange as well. Generally
column selection is done with one of "[[a]]" or "[ , a]". I am not
absolutely sure that you cannot have "[[,]]" but I was under the
impression you could not. And you shouldn't be returning NULLs if what
you really want are NAs.
> Iffy<-apply( df.1, 1, FUN=ify, x=df.1, y=df.2, a=2, b=3,
> c=1 )
So a single vector (one row of df.1) will be passed as the first
argument of the ify function, and the rest of its arguments will be
populated from the other arguments you give to apply. You do NOT need
to supply an "x" argument in that list, and if you do so you will throw
an error, because the row vector then has no argument left to match.
Furthermore, you cannot expect the apply function to keep track of
which row it is on for indexing a different data.frame. The mapply
function might be used for this purpose but I am going to suggest a
much cleaner solution below.
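To make the point about apply concrete, here is a minimal sketch. The toy data frames d1 and d2 and the helper show_arg are made up for illustration and are not from the original post. It shows that FUN receives each row as a plain atomic vector, and how mapply could step through two columns in parallel.

```r
# Hypothetical toy data, not from the original post
d1 <- data.frame(Letters = c("a", "b", "c"), numb1 = c(1, 2, 3),
                 stringsAsFactors = FALSE)
d2 <- data.frame(Letters = c("a", "b", "c"), numb1 = c(1, 9, 3),
                 stringsAsFactors = FALSE)

# apply() converts the data frame to a matrix first, so each row arrives
# as an atomic vector (character here, because one column is character)
show_arg <- function(row) class(row)
row_classes <- apply(d1, 1, show_arg)
row_classes

# mapply() walks several vectors in parallel, one element at a time,
# so the two data frames stay aligned row by row
matched <- mapply(function(name, v1, v2) if (v1 == v2) name else NA_character_,
                  d1$Letters, d1$numb1, d2$numb1, USE.NAMES = FALSE)
matched
```

Inside the mapply helper each comparison involves single values, so a plain if/else is safe there.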
>
> But this didn't work... Error in FUN(newX[, i], ...) : unused
> argument(s) (newX[, i])
>
>
> Here is a dataset that replicates the problem, you'll notice the "h"
> criteria values are different between the two dataframes and
> therefore it would produce a list of the 9 letters where the two
> criteria columns matched (a,b,c,d,e,f,g,i,j):
If you know that df.1 and df.2 have the same number of rows then use
the ifelse function, which is designed to work on vectors. The
if(){}else{} construct is NOT:
> ifelse( df.1[,2] ==df.2[,3], {as.character(df.1[,1])} , {NA} )
[1] "a" "b" "c" "d" "e" "f" "g" NA "i" "j"
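The difference between the two constructs can be sketched with two small made-up vectors (x and y are illustrative names only). ifelse tests every element, while if expects a single TRUE or FALSE, so a vector comparison has to be reduced first, for example with all() or any().

```r
x <- c(1, 2, 3)   # hypothetical values
y <- c(1, 5, 3)

# ifelse() evaluates the test element by element and returns a vector
# of the same length as the test
res <- ifelse(x == y, "same", "different")
res

# if () needs one logical value, so collapse the comparison first
all_same <- if (all(x == y)) "all equal" else "not all equal"
all_same
```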
The reason as.character was needed lies in the fact that you
constructed df.1[,1] as a factor variable. As I understand it,
ifelse tries to make it numeric to match the datatype of the
comparison. I've never understood this, frankly. Maybe someone can
educate me.
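A short sketch of what happens with and without as.character, using a made-up factor f. One possible explanation, offered as a reading of the observed behavior rather than a definitive account, is that ifelse starts from the logical test vector and assigns the selected values into it, which drops the factor attributes and leaves only the underlying integer codes.

```r
f <- factor(c("a", "b", "c"))   # hypothetical factor column
test <- c(TRUE, FALSE, TRUE)

# Passing the factor directly: the labels are lost, codes come back
codes <- ifelse(test, f, NA)
codes

# Converting to character first keeps the labels
labels <- ifelse(test, as.character(f), NA)
labels
```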
If you wanted a function that allowed you to specify the columns and
dataframes then consider this
ret3.m1.eq.n2 <- function(df1, df2, col1, col2, col3){
    # return df1[,col3] (as character) where df1[,col1] equals df2[,col2], else NA
    ifelse( df1[,col1] == df2[,col2],
            as.character(df1[,col3]), NA )
}
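If only the matching names are wanted, without NA placeholders for the rows that differ, logical indexing is one option. This is a sketch on made-up data frames d1 and d2 laid out like the ones in the question (names in column 1, the values to compare in column 2 of the first and column 3 of the second); it assumes both have the same number of rows in the same order.

```r
# Hypothetical toy data, not from the original post
d1 <- data.frame(Letters = letters[1:4], numb1 = c(1.5, 2.5, 3.5, 4.5),
                 stringsAsFactors = FALSE)
d2 <- data.frame(Letters = letters[1:4], extra = 1:4,
                 numb1 = c(1.5, 2.5, 12, 4.5), stringsAsFactors = FALSE)

# TRUE for the rows where the two criteria columns agree
keep <- d1[, 2] == d2[, 3]

# Keep only the names from those rows
matching_names <- as.character(d1[keep, 1])
matching_names
```

Exact comparison with == works here because the values are identical copies; for numbers that were computed separately, a tolerance-based comparison may be safer.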
>
>
>
> df.1<-data.frame(rep(letters[1:10]))
> colnames(df.1)[1]<-("Letters")
> set.seed(1)
> df.1$numb1<-rnorm(10,1,1)
> df.1$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
> df.1$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
> df.1
>
> df.2<-data.frame(rep(letters[1:10]))
> colnames(df.2)[1]<-("Letters")
> set.seed(1)
> df.2$extra.col<-c(1,2,3,4,5,6,7,8,9,10)
> df.2$numb1<-rnorm(10,1,1)
> df.2$id<-c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
> df.2[8,3]<-12
>
> df.1
> df.2
>
>
>
>
> Your patience is much appreciated,
> Rob
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT