[R] selecting values that are unique, instead of selecting unique values

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jun 25 18:59:35 CEST 2008


On Wed, 25 Jun 2008, Gabor Csardi wrote:

> Wow, that is smart, although it seems to be overkill.
> I guess 'duplicated' is better than O(n^2); is it really?

Yes, since it hashes, but the overhead on short vectors is high because 
it always hashes.
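
For reference, a minimal sketch of what a 'duplicated'-based version might 
look like. The helper name 'singletons' is made up for illustration and is 
not part of R; it assumes a version of R in which duplicated() accepts 
the 'fromLast' argument.

```r
# 'singletons' is a hypothetical helper name, not a base R function.
singletons <- function(x) {
  # duplicated(x) flags the second and later occurrences of a value;
  # duplicated(x, fromLast = TRUE) flags every occurrence except the last.
  # A value that occurs more than once is flagged by at least one of them.
  x[!(duplicated(x) | duplicated(x, fromLast = TRUE))]
}

Vec <- c(1:10, 1)
singletons(Vec)          # values occurring exactly once: 2, 3, ..., 10
singletons(c(1, 1:10))   # same values; the order of the input does not matter
```

Both calls to duplicated() hash the vector, so the cost should grow roughly 
linearly with length(x) rather than quadratically.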

>
> Gabor
>
> On Wed, Jun 25, 2008 at 05:43:30PM +0100, Prof Brian Ripley wrote:
>> On Wed, 25 Jun 2008, Marc Schwartz wrote:
>>
>>> on 06/25/2008 11:19 AM Daren Tan wrote:
>>>>
>>>> unique(c(1:10,1)) gives 1:10 (i.e. unique values), is there any
>>>> method to get only 2:10 (i.e. values that are unique) ?
>>>>
>>>
>>> The easiest might be:
>>>
>>>> Vec
>>> [1]  1  2  3  4  5  6  7  8  9 10  1
>>>
>>>> Vec[table(Vec) == 1]
>>> [1]  2  3  4  5  6  7  8  9 10
>>
>> I don't think that is right: you are relying on recycling indices.  Try
>>
>> Vec <- c(1,1:10)
>> Vec[table(Vec) == 1]
>>
>> which should give the same result.
>>
>> I was about to write
>>
>> tab <- table(Vec)
>> names(tab)[tab==1]
>>
>> but that gives a character vector.  Here's a different way:
>>
>> Vec[rowSums(outer(Vec, Vec, "=="))==1]
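
The points in the quoted message can be checked with a short sketch. The 
variable names 'keep', 'from_names' and 'from_outer' are illustrative only, 
and the conversion with as.numeric() assumes the values survive a round 
trip through character, which holds for the small whole numbers used here 
but is not guaranteed for arbitrary doubles.

```r
Vec <- c(1, 1:10)

# table() returns one count per distinct value (10 here), but Vec has
# 11 elements, so the logical mask is recycled against positions.
tab  <- table(Vec)
keep <- tab == 1
Vec[keep]                 # 1 2 3 4 5 6 7 8 9, which is not the wanted answer

# names(tab)[tab == 1] selects the right values but as character strings;
# converting back is an assumption that suits this example only.
from_names <- as.numeric(names(tab)[tab == 1])

# The outer() approach compares every element with every other one, so
# it is O(n^2) in time and memory, but it indexes Vec by position correctly.
from_outer <- Vec[rowSums(outer(Vec, Vec, "==")) == 1]

identical(from_names, from_outer)
```

With Vec <- c(1:10, 1) the recycled mask happens to line up with the right 
positions, which is why the problem does not show up in the first example.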

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


