[R] help matching rows of a data frame

K. Elo maillists at pp.inet.fi
Mon Sep 18 16:32:33 CEST 2017


Hi!
2017-09-18 07:13 -0500, Therneau, Terry M., Ph.D. wrote:
> This question likely has a 1 line answer, I'm just not seeing it.
> (2, 3, or 10 lines is fine too.)
> 
> For a vector I can do group <- match(x, unique(x)) to get a vector
> that labels each element of x.

Strictly speaking, that gives you a vector of indices into 'unique(x)',
not a vector of labels.

> x<-c("A","B","C","A","C","D")
> group<-match(x, unique(x))
> group
[1] 1 2 3 1 3 4
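
To make the distinction concrete, here is a small sketch: indexing
unique(x) with the result of match() gives back the original values, so
'group' holds positions in unique(x) rather than labels. The name
'recovered' is only for illustration:

```r
x <- c("A","B","C","A","C","D")
group <- match(x, unique(x))

# group holds positions in unique(x); indexing with it recovers the values
recovered <- unique(x)[group]
identical(recovered, x)
```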

> What is an equivalent if x is a data frame?

So you want to generate an index in which duplicated rows carry the row
index of their first occurrence, right? Something like this could work:

> x <- data.frame("X0"=c("A","B","C","C","D","A"), "X1"=c(1,2,1,1,3,1))
> group <- rownames(x)   # start with every row in its own group
> for (i in seq_len(nrow(x)-1)) {
     for (j in (i+1):nrow(x)) {
        # rows i and j are duplicates if all columns agree
        if (all(x[i,]==x[j,])) {
           group[j] <- group[i] }
     }
   }
>  group
[1] "1" "2" "3" "3" "5" "1"
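
The double loop compares every pair of rows, so it can get slow for
large data frames. As a rough vectorised sketch, one could instead
paste each row into a single key string and reuse match() on the keys.
The names 'key' and 'first' and the "\r" separator are my own choices
here, and the approach assumes that no value contains the separator and
that numeric columns survive the conversion to character without loss:

```r
x <- data.frame(X0 = c("A","B","C","C","D","A"), X1 = c(1,2,1,1,3,1))

# Collapse each row into one string; "\r" is an arbitrary separator
# that is assumed not to occur in the data.
key <- do.call(paste, c(x, sep = "\r"))

# Consecutive group ids, analogous to match(x, unique(x)) for a vector
group <- match(key, unique(key))

# Row index of the first occurrence of each row
first <- match(key, key)
```

Here 'group' numbers the distinct rows in order of appearance, while
'first' holds the same indices as the loop result, only as integers
rather than row names.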

HTH,
Kimmo


