[R] Merging two data frames, but keeping NAs

Sarah Goslee sarah.goslee at gmail.com
Thu Dec 5 16:11:22 CET 2013


Adding the argument all.x=TRUE to merge() will retain every row of x,
including those where ref is NA or has no match in y. The only reliable
way I've found to preserve the row order of x in such a merge is to add
an index column to x, merge the data, sort on the index column, then
delete it.
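
A minimal sketch of those steps, using the x and y from the question
quoted below. The column name idx is an arbitrary choice for
illustration, and resetting the row names at the end is optional:

```r
# Example data from the quoted question
x <- data.frame(ref = c(NA, NA, NA, 10:5, NA, 1:5))
y <- data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)

# 1. Add an index column recording the original row order of x
#    (idx is just an illustrative name)
x$idx <- seq_len(nrow(x))

# 2. Merge; all.x = TRUE keeps the rows of x that have no match in y
result <- merge(x, y, by.x = "ref", by.y = "id", all.x = TRUE)

# 3. Sort on the index column to restore the original order of x
result <- result[order(result$idx), ]

# 4. Delete the index column and tidy up the row names
result$idx <- NULL
rownames(result) <- NULL

result
```

The result should have the same 15 rows as x, in the same order, with
val set to NA wherever ref is NA or does not appear in y$id.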

Sarah

On Thu, Dec 5, 2013 at 9:56 AM, Rainer M Krug <Rainer at krugs.de> wrote:
> Hi
>
> My brain is giving up on this...
>
> I have the following two data.frames:
>
>   x <-  data.frame(ref=c(NA, NA, NA, 10:5, NA, 1:5))
>   y <-  data.frame(id = c(2, 3, 4, 6, 7, 9, 8), val = 101:107)
>
> Which look as follows:
>
>> x
>    ref
> 1   NA
> 2   NA
> 3   NA
> 4   10
> 5    9
> 6    8
> 7    7
> 8    6
> 9    5
> 10  NA
> 11   1
> 12   2
> 13   3
> 14   4
> 15   5
>> y
>   id val
> 1  2 101
> 2  3 102
> 3  4 103
> 4  6 104
> 5  7 105
> 6  9 106
> 7  8 107
>>
>
> Now I want to merge y into x, such that
>
> a) the sort order of x stays the same (sort=FALSE in merge()) and
> b) the NAs stay
>
> The result should look as follows (column id only here for clarity):
>
>> result
>    ref  id  val
> 1   NA  NA   NA
> 2   NA  NA   NA
> 3   NA  NA   NA
> 4   10  NA   NA
> 5    9   9  106
> 6    8   8  107
> 7    7   7  105
> 8    6   6  104
> 9    5  NA   NA
> 10  NA  NA   NA
> 11   1  NA   NA
> 12   2   2  101
> 13   3   3  102
> 14   4   4  103
> 15   5  NA   NA
>
> merge(x, y, by.x="ref", by.y="id", sort=FALSE) leaves out the NA, but
> otherwise it works:
>
>> merge(x, y, by.x=1, by.y="id", sort=FALSE)
>   ref val
> 1   9 106
> 2   8 107
> 3   7 105
> 4   6 104
> 5   2 101
> 6   3 102
> 7   4 103
>
> Is there any way that I can tell merge() to keep the NA, or how can I
> achieve what I want?
>
> Thanks,
>
> Rainer
>

-- 
Sarah Goslee
http://www.functionaldiversity.org


