[R] matching type question, please

Fri Dec 17 18:00:31 CET 2021

Hello,

Inline.

At 23:29 on 16/12/21, Bert Gunter wrote:
> Not sure what you mean by this:
> 
> "But this only works if the vectors xr* are longer than xs*."
> 
> The solution I gave doesn't care about this.
>> a <- rbind(unique(z2),unique(z1))
>> a[duplicated(a),]
>       xs1 xs2
> ## as before
> Presumably you are referring to your use of match() (which is how %in%
> is defined).

Yes, that's what I meant. It is probably more time- and memory-efficient
to have cbind() create a matrix of length(xr1) rows and extract the wanted
rows with a logical index of the same length. Using match() avoids
creating (possibly) longer data sets, but it raises the problem I
mentioned.
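
A minimal sketch of a variant that should not depend on which set is
longer, assuming the pairs can be compared as pasted character keys.
Testing each column separately with %in% can flag a row whose two values
both occur in the other set but never together, so this compares whole
rows instead. The name matchPairs and the "\r" separator are my own
choices for illustration, not something from earlier in the thread.

```r
# Hypothetical helper (not from the thread): match rows as whole pairs.
matchPairs <- function(x1, x2, y1, y2) {
  x <- cbind(x1, x2)
  # One key per row. Assumes "\r" never occurs in the data, and that
  # as.character() distinguishes the values (fine for small integers).
  keyx <- paste(x1, x2, sep = "\r")
  keyy <- paste(y1, y2, sep = "\r")
  # Keep rows of x whose key occurs in y, reporting each pair once.
  keep <- (keyx %in% keyy) & !duplicated(keyx)
  # drop = FALSE keeps a matrix even when one or zero rows match.
  x[keep, , drop = FALSE]
}

# Erin's data
xr1 <- 8:0
xr2 <- 0:8
xs1 <- 9:3
xs2 <- rep(4, 7)
matchPairs(xr1, xr2, xs1, xs2)   # the pair (4, 4)
matchPairs(xs1, xs2, xr1, xr2)   # same pair with the sets swapped

# Bert's data: no pair is common to both sets
matchPairs(c(1, 2, 1), c(4, 5, 4), c(6, 6), c(7, 7))   # zero rows

# (1, 2) is not a row of the second set, although 1 is in y1 and 2 in y2
matchPairs(1, 2, c(1, 3), c(4, 2))   # zero rows
```

The result keeps the column names of the first pair of arguments, so
swapping the sets changes the labels but not the matched pairs.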

Rui Barradas

> 
> Bert
> 
> 
> 
> On Thu, Dec 16, 2021 at 2:51 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>> Hello,
>>
>> And here is another solution, addressing the problem raised by Bert and
>> avoiding unique.
>>
>>
>> xr1 <- 8:0
>> xr2 <- 0:8
>> xs1 <- 9:3
>> xs2 <- 4
>> cbind(xr1, xr2)[(xr1 %in% xs1) & (xr2 %in% xs2),]
>> #xr1 xr2
>> #  4   4
>>
>>
>> xr1 <- c(1,2,1)
>> xr2 <- c(4,5,4)
>> xs1 <- c(6,6)
>> xs2 <- c(7,7)
>> cbind(xr1, xr2)[(xr1 %in% xs1) & (xr2 %in% xs2),]
>> #   xr1 xr2
>> (only column names are output)
>>
>>
>> But this only works if the vectors xr* are longer than xs*. Try swapping
>> the test values (both sets, Erin's original and Bert's) and see.
>>
>> So here is a function that checks lengths first, then takes the right
>> branch.
>>
>>
>> dupSpecial <- function(x1, x2, y1, y2){
>>     if(length(x1) > length(y1)){
>>       cbind(x1, x2)[(x1 %in% y1) & (x2 %in% y2),]
>>     } else {
>>       cbind(y1, y2)[(y1 %in% x1) & (y2 %in% x2),]
>>     }
>> }
>> dupSpecial(xr1, xr2, xs1, xs2)
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>> At 22:01 on 16/12/21, Bert Gunter wrote:
>>> I am not sure Eric's solution is what is wanted:
>>>
>>> Consider:
>>> xr1 <- c(1,2,1)
>>> xr2 <- c(4,5,4)
>>> xs1 <- c(6,6)
>>> xs2 <- c(7,7)
>>>
>>>> z1 <- cbind(xr1, xr2)
>>>> z2 <- cbind(xs1,xs2)
>>>> z1
>>>        xr1 xr2
>>> [1,]   1   4
>>> [2,]   2   5
>>> [3,]   1   4
>>>> z2
>>>        xs1 xs2
>>> [1,]   6   7
>>> [2,]   6   7
>>>
>>> If what is wanted is to find rows of z2 that match those in z1, Eric's
>>> proposal gives (note the added comma, so that the logical vector
>>> indexes rows):
>>>
>>>> a <- cbind(c(xr1,xs1),c(xr2,xs2))
>>>> a[duplicated(a),]
>>>        [,1] [,2]
>>> [1,]    1    4
>>> [2,]    6    7
>>>
>>> This is obviously wrong, as it gives duplicates *within* z1 and z2,
>>> not between them. To get rows of z2 that appear as duplicates of rows
>>> of z1, then something like the following should do:
>>>
>>>> a <- rbind(unique(z1),unique(z2))
>>>> a
>>>        xr1 xr2
>>> [1,]   1   4
>>> [2,]   2   5
>>> [3,]   6   7
>>>> a[duplicated(a),]
>>>        xr1 xr2
>>> ## nothing
>>>
>>> I leave it to Erin to determine whether this is relevant to her
>>> problem and, if so, how to fix up my suggestion appropriately.
>>>
>>> Cheers,
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>> On Thu, Dec 16, 2021 at 12:39 PM Eric Berger <ericjberger using gmail.com> wrote:
>>>>
>>>>> a <- cbind(c(xr1,xs1),c(xr2,xs2))
>>>>> a[duplicated(a)]
>>>> [1] 4 4
>>>>
>>>>
>>>> On Thu, Dec 16, 2021 at 10:18 PM Erin Hodgess <erinm.hodgess using gmail.com> wrote:
>>>>>
>>>>> Hello!
>>>>>
>>>>> I have the following:
>>>>>
>>>>>> cbind(xr1,xr2)
>>>>>      xr1 xr2
>>>>> [1,]   8   0
>>>>> [2,]   7   1
>>>>> [3,]   6   2
>>>>> [4,]   5   3
>>>>> [5,]   4   4
>>>>> [6,]   3   5
>>>>> [7,]   2   6
>>>>> [8,]   1   7
>>>>> [9,]   0   8
>>>>>
>>>>>> cbind(xs1,xs2)
>>>>>      xs1 xs2
>>>>> [1,]   9   4
>>>>> [2,]   8   4
>>>>> [3,]   7   4
>>>>> [4,]   6   4
>>>>> [5,]   5   4
>>>>> [6,]   4   4
>>>>> [7,]   3   4
>>>>>
>>>>> These are ordered pairs.  I would like to get something that shows that the
>>>>> pair (4,4) appears in both.  I have tried cbind with match and %in% and
>>>>> intersect, but not getting the exact results.
>>>>>
>>>>> Any suggestions would be appreciated.  I have a feeling that it's something
>>>>> really easy that I'm just not seeing.
>>>>>
>>>>> Thanks,
>>>>> Erin
>>>>>
>>>>>
>>>>> Erin Hodgess, PhD
>>>>> mailto: erinm.hodgess using gmail.com
>>>>>
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.