[R] return first index for each unique value in a vector

Wed Aug 29 01:21:08 CEST 2012

Hi,

I was also thinking of duplicated(), but Bert already posted that solution.  The alternative below, based on table() and match(), works but is not very efficient.
A<-c(9,2,9,5)
unik<-as.numeric(names(table(A)))
match(unik,A)
#[1] 2 4 1
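
A possible variant of the same idea, as a minimal sketch: sort(unique(A)) gives the sorted unique values directly, so the detour through names(table(A)) and as.numeric() is not needed. That detour goes through character strings, which may not reproduce non-integer values exactly. The names u and idx are only illustrative.

```r
A <- c(9, 2, 9, 5)

# Unique values in increasing order: 2 5 9
u <- sort(unique(A))

# match() returns the position of the first occurrence of each value in A
idx <- match(u, A)
idx
# [1] 2 4 1

# Values and their first indices side by side
res <- data.frame(value = u, first_index = idx)
```

I have not timed this against the other approaches, so treat it as a readability suggestion rather than a speed claim.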

#Bert's solution wins here.
system.time({
set.seed(1)
A<-sample(1:5,1e6,replace=TRUE)
unik <- !duplicated(A)  ## logical vector: TRUE at the first occurrence of each value
seq_along(A)[unik]  ## indices
A[unik]})
 user  system elapsed 
  0.040   0.016   0.056 
#My solution
system.time({
set.seed(1)
A<-sample(1:5,1e6,replace=TRUE)
#unik<-as.numeric(names(table(A)))
match(as.numeric(names(table(A))),A)})
 user  system elapsed 
 0.344   0.036   0.383 
#Robert's solution
system.time({
set.seed(1)
A<-sample(1:5,1e6,replace=TRUE)
as.numeric(rownames(unique(data.frame(A)[1])))})
 user  system elapsed 
  0.056   0.012   0.069 
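
Note that the three approaches timed above do not return the indices in the same order: the duplicated() and data.frame() versions list them in order of first appearance, while the table()/match() version lists them by sorted value, which is the order asked for in the original question. A minimal sketch of how the duplicated() result could be reordered by value (the names first and by_value are only illustrative):

```r
A <- c(9, 2, 9, 5)

# Indices in order of first appearance (values 9, 2, 5)
first <- which(!duplicated(A))
first
# [1] 1 2 4

# Same indices reordered by the value they point to (2, 5, 9)
by_value <- first[order(A[first])]
by_value
# [1] 2 4 1
```

The extra order() call only sorts the unique values, not the whole vector, so it should add little cost when there are few distinct values, although I have not timed it.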
A.K.

----- Original Message -----
From: Bronwyn Rayfield <bronwynrayfield at gmail.com>
To: r-help at r-project.org
Cc: 
Sent: Tuesday, August 28, 2012 3:58 PM
Subject: [R] return first index for each unique value in a vector

I would like to efficiently find the first index of each unique value in a
very large vector.

For example, if I have a vector

A<-c(9,2,9,5)

I would like to return not only the unique values (2,5,9) but also their
first indices (2,4,1).

I tried using a for loop with which(A==unique(A)[i])[1] to find the first
index of each unique value but it is very slow.

What I am trying to do is easily and quickly done with the "unique"
function in MATLAB (see
http://www.mathworks.com/help/techdoc/ref/unique.html).

Thank you for your help,
Bronwyn

