[R] merge function while obviating duplicate columns XXXX
William Dunlap
wdunlap at tibco.com
Mon Mar 11 22:24:03 CET 2013
You can use the set-oriented functions setdiff(), union(), and intersect().
E.g., setdiff(colnames(data2), colnames(data1)) gives the names of columns
of data2 that are not names of columns of data1. The following might be
what you want:
merge(data1, data2[, c("id", setdiff(colnames(data2),colnames(data1)))], by="id")
You didn't give an example of the data or the desired result, so I made some up:
> data1 <- data.frame(id=c(1,1,2,3), Name=c("Joe","Joe","Ken","Leo"))
> data2 <- data.frame(id=c(2,3), Name=c("Melody","Nell"), Age=c(45,49))
> merge(data1, data2, by="id")
  id Name.x Name.y Age
1  2    Ken Melody  45
2  3    Leo   Nell  49
> merge(data1, data2[, c("id", setdiff(colnames(data2),colnames(data1)))], by="id")
  id Name Age
1  2  Ken  45
2  3  Leo  49
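
If this is needed in several places, or with more than one key column, the same setdiff() idea could be wrapped in a small helper. This is only a sketch: the name merge_keep_first() is made up for illustration and is not part of base R, and it assumes every column named in 'by' exists in both data frames.

```r
# Hypothetical helper (not part of base R): merge x and y on the key
# column(s) in 'by', keeping x's copy of any other column the two share.
merge_keep_first <- function(x, y, by, ...) {
    # assumption: all key columns are present in both inputs
    stopifnot(all(by %in% colnames(x)), all(by %in% colnames(y)))
    # names of columns of y that are not names of columns of x
    extra <- setdiff(colnames(y), colnames(x))
    # drop=FALSE keeps the subset a data.frame even when it has one column
    merge(x, y[, c(by, extra), drop = FALSE], by = by, ...)
}

data1 <- data.frame(id=c(1,1,2,3), Name=c("Joe","Joe","Ken","Leo"))
data2 <- data.frame(id=c(2,3), Name=c("Melody","Nell"), Age=c(45,49))
result <- merge_keep_first(data1, data2, by="id")
```

To keep data2's copy of the shared columns instead, swap the first two arguments: merge_keep_first(data2, data1, by="id"). The drop=FALSE matters when data2 has no columns beyond the key and the shared ones, since a single-column subset would otherwise collapse to a vector.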
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Dan Abner
> Sent: Monday, March 11, 2013 2:02 PM
> To: Ista Zahn
> Cc: r-help at r-project.org
> Subject: Re: [R] merge function while obviating duplicate columns XXXX
>
> Ok, let's say I only want the common columns from data1. Is there a
> succinct way of doing this for potentially hundreds of "in common"
> columns?
>
>
>
> On Mon, Mar 11, 2013 at 3:25 PM, Ista Zahn <istazahn at gmail.com> wrote:
> > On Mon, Mar 11, 2013 at 3:17 PM, Dan Abner <dan.abner99 at gmail.com> wrote:
> >> Hi everyone,
> >>
> >> I have the following call to the merge() function. How does one
> >> prevent duplicate columns in the resulting data frame that the 2
> >> parent data frames have in common but are not true key or "by"
> >> variables?
> >>
> >>
> >> data3<-merge(data1,data2,by="id")
> >> data3
> >>
> >> id total.x total.y balance
> >>  1      78      78      90
> >>  2      91      91      63
> >>  3      74      74      57
> >>  4      89      89      58
> >>  5      90      90      27
> >>
> >>
> >> In this example, total is not a true key or "by" variable that
> >> uniquely identifies rows suitable for matching purposes, but instead
> >> just happens to be common to both sets.
> >
> > Well, which one do you want? Or do you want to exclude total from the result?
> >
> >>
> >> In reality, I have hundreds for these "in common" variables, so I need
> >> a solution that is tractable for a large number of "in common"
> >> columns.
> >>
> >> Thanks!
> >>
> >> Dan
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.