[R] Concatenating data frames with partial overlap in variable names
Marc Schwartz
marc_schwartz at comcast.net
Sun Mar 25 04:00:32 CEST 2007
On Sat, 2007-03-24 at 21:47 -0400, Daniel Folkinshteyn wrote:
> Greetings to all.
> I need to concatenate data frames that do not share all the same variable
> names; the variables overlap only partially. For example, suppose I have
> two data frames, a and b, that look like the following:
> > a
>   a b
> 1 1 4
> 2 2 5
> 3 3 6
> 4 4 7
> 5 5 8
> > b
>   c  a
> 1 1 10
> 2 2 11
> 3 3 12
> 4 4 13
> 5 5 14
>
> I want to concatenate them by row, without any matching, so that the
> variables that are not available in all frames get NAs. The result
> should look like:
>
>     a  b  c
> 1   1  4 NA
> 2   2  5 NA
> 3   3  6 NA
> 4   4  7 NA
> 5   5  8 NA
> 6  10 NA  1
> 7  11 NA  2
> 8  12 NA  3
> 9  13 NA  4
> 10 14 NA  5
>
> rbind doesn't work, since it requires all variables to be matched
> between the two data frames. merge doesn't work, since it wants to
> /match/ by columns with the same name, and if matching by nothing,
> produces a cartesian product.
>
> Is there a neat trick for doing this simply, or am I stuck with
> comparing variable lists and generating NAs manually?
>
> I would appreciate any help!
> Daniel
You can use merge():
> a
  a b
1 1 4
2 2 5
3 3 6
4 4 7
5 5 8
> b
  c  a
1 1 10
2 2 11
3 3 12
4 4 13
5 5 14
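Use 'a' as the common 'by' column and specify 'all = TRUE' so that rows
whose value of 'a' has no match in the other data frame are kept, with NA
in the columns they lack. Here no value of 'a' appears in both data frames,
so no rows are combined: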
> merge(a, b, by = "a", all = TRUE)
    a  b  c
1   1  4 NA
2   2  5 NA
3   3  6 NA
4   4  7 NA
5   5  8 NA
6  10 NA  1
7  11 NA  2
8  12 NA  3
9  13 NA  4
10 14 NA  5
See ?merge for more information.
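For anyone who wants to try this, here is a minimal sketch. The data.frame()
calls are an assumption reconstructed from the printed output above, and
rbind_fill() is a hypothetical helper written for illustration, not a base R
function. It pads each data frame with the columns it lacks and then calls
rbind(), so rows are stacked without being matched on 'a'.

```r
# Data frames reconstructed from the printed output (assumed construction).
a <- data.frame(a = 1:5, b = 4:8)
b <- data.frame(c = 1:5, a = 10:14)

# Approach from this post: full outer join on the shared column 'a'.
merged <- merge(a, b, by = "a", all = TRUE)

# Hypothetical helper: add each missing column as NA, then stack by row.
# Assumes both data frames have at least one row.
rbind_fill <- function(x, y) {
  all_names <- union(names(x), names(y))
  for (nm in setdiff(all_names, names(x))) x[[nm]] <- NA
  for (nm in setdiff(all_names, names(y))) y[[nm]] <- NA
  rbind(x[all_names], y[all_names])
}

stacked <- rbind_fill(a, b)
```

With these inputs the two results hold the same values. They would differ if
some value of 'a' occurred in both data frames: merge() would combine those
rows into one, while the rbind-based helper would keep them as separate rows.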
HTH,
Marc Schwartz