[R] return first index for each unique value in a vector

Bert Gunter gunter.berton at gene.com
Wed Aug 29 00:52:21 CEST 2012


Sheesh!

I would have thought that someone would have noticed that on the
?unique Help page there is a link to ?duplicated, which gives a
_logical_ vector of the duplicates. From this, everything else can be
quickly derived -- and packaged in a simple Matlab-like function, if
you insist on that, e.g.

unik <- !duplicated(A)  ## logical vector: TRUE at the first occurrence of each value
seq_along(A)[unik]  ## indices of those first occurrences
A[unik] ## the unique values, in order of first appearance

If you want the values in increasing order, with their indices
reordered to match, see ?order
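
For what it's worth, here is a minimal sketch of how the pieces above
could be packaged into one function. The name first_unique and the
names of the list elements are made up for illustration; they are not
from base R or from MATLAB. It assumes a plain atomic vector and sorts
the unique values increasingly, as in the original example.

```r
## Hypothetical helper: first_unique() is an illustrative name, not a base R function.
first_unique <- function(x) {
  first <- which(!duplicated(x))  ## positions of first occurrences, in order of appearance
  ord <- order(x[first])          ## permutation that sorts those values increasingly
  list(values = x[first][ord],    ## sorted unique values
       index  = first[ord])       ## first index of each sorted value
}

A <- c(9, 2, 9, 5)
res <- first_unique(A)
res$values  ## 2 5 9
res$index   ## 2 4 1
```

Since duplicated() and order() are both vectorized, this should avoid
the per-value which() calls of the for loop approach.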

-- Bert

On Tue, Aug 28, 2012 at 3:32 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> On Tue, Aug 28, 2012 at 2:58 PM, Bronwyn Rayfield
> <bronwynrayfield at gmail.com> wrote:
>> I would like to efficiently find the first index of each unique value in a
>> very large vector.
>>
>> For example, if I have a vector
>>
>> A<-c(9,2,9,5)
>>
>> I would like to return not only the unique values (2,5,9) but also their
>> first indices (2,4,1).
>>
>> I tried using a for loop with which(A==unique(A)[i])[1] to find the first
>> index of each unique value but it is very slow.
>
> You'll get marginally more speed from which.max() but I'm sure there's
> a better way. I'll write if I can think of it.
>
> Michael
>
>>
>> What I am trying to do is easily and quickly done with the "unique"
>> function in MATLAB (see
>> http://www.mathworks.com/help/techdoc/ref/unique.html).
>>
>> Thank you for your help,
>> Bronwyn
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
