[R] a merge() problem

Peter Ehlers ehlers at ucalgary.ca
Mon Oct 8 03:57:29 CEST 2012


On 2012-10-07 14:44, Sam Steingold wrote:
>> * Peter Ehlers <ruyref at hpnytnel.pn> [2012-10-07 10:03:42 -0700]:
>>
>> On 2012-10-07 08:34, Sam Steingold wrote:
>>> I know it does not look very good - using the same column names to mean
>>> different things in different data frames, but here you go:
>>> --8<---------------cut here---------------start------------->8---
>>>> x <- data.frame(a=c(1,2,3),b=c(4,5,6))
>>>> y <- data.frame(b=c(1,2),a=c("a","b"))
>>>> merge(x,y,by.x="a",by.y="b",all.x=TRUE,suffixes=c("","y"))
>>>     a b    a
>>> 1 1 4    a
>>> 2 2 5    b
>>> 3 3 6 <NA>
>>> Warning message:
>>> In merge.data.frame(x, y, by.x = "a", by.y = "b", all.x = TRUE) :
>>>     column name 'a' is duplicated in the result
>>> --8<---------------cut here---------------end--------------->8---
>>> why is the suffixes argument ignored?
>>> I mean, I expected the second "a" to be "a.y".
>>
>> The 'suffixes' argument refers to _non-by_ names only (as per ?merge).
>
> yes, but "a" in "y" is _not_ a by-name.

Yes, it is.
The set of by-names is the union of names specified by by.x and by.y,
in your case: c("a", "b").
I suppose a case could be made that ?merge does not spell that
out explicitly enough.
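
For illustration only, here is a minimal sketch of one way around the
clash, assuming it is acceptable to rename the non-key column of y
before merging. The replacement name "a.y" and the second pair of data
frames (x2 and y2, with columns id and v) are made up for this example
and are not from the original post. The second call shows the case that
'suffixes' does cover: a non-by column name shared by both data frames.

```r
x <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
y <- data.frame(b = c(1, 2), a = c("a", "b"))

# Rename y's non-key column by hand so that it cannot collide with
# x's key column "a" ("a.y" is an arbitrary choice for this sketch).
names(y)[names(y) == "a"] <- "a.y"

res <- merge(x, y, by.x = "a", by.y = "b", all.x = TRUE)
names(res)   # "a" "b" "a.y"

# Hypothetical data: "v" is a non-by column present in both frames,
# which is the situation the 'suffixes' argument is meant for.
x2 <- data.frame(id = 1:2, v = c(10, 20))
y2 <- data.frame(id = 1:2, v = c("p", "q"))

res2 <- merge(x2, y2, by = "id", suffixes = c("", ".y"))
names(res2)  # "id" "v" "v.y"
```

Renaming before the merge keeps the result independent of how merge()
treats a by-name that also appears as a non-by name in the other data
frame.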

Peter Ehlers



