[R] Identify row indices corresponding to each distinct row of a matrix

Fri Nov 9 04:23:42 CET 2018

Thanks. It makes sense.

On Thu, Nov 8, 2018 at 8:05 PM, Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:

> The duplicated function returns TRUE for rows that have already
> appeared... exactly one of the rows is not represented in the output of
> duplicated. For the intended purpose of removing duplicates this behavior
> is ideal. I have no idea what your intended purpose is, since every row has
> duplicates elsewhere in the matrix. If you really want every set identified
> this way then a loop/apply seems inevitable (most opportunities for
> optimization come about by not visiting every combination).
>
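> Cm <- as.matrix( C )
> # indices of the first occurrence of each distinct row
> D <- which( !duplicated( Cm, MARGIN=1 ) )
> nCm <- nrow( Cm )
> # for each distinct row, find every row of Cm that matches it exactly
> F <- lapply( D, function(d) {
>    idxrep <- rep( d, nCm )
>    which( 0 == unname( rowSums( Cm[idxrep,] != Cm ) ) )
>   } )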
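
If every group of matching rows is wanted, one way to avoid writing the loop by hand is to build one string key per row and let split() do the grouping. This is only a sketch and not the method posted above: the names key and groups and the small matrix are made up for illustration, and pasting numbers into strings assumes the values survive conversion to character (fine for integers, questionable for nearly equal doubles).

```r
# Small stand-in for Cm: rows 1, 3 and 6 are identical, as are rows 2 and 5.
Cm <- matrix(c(1,  2, 1,  3,  2, 1,
               9, 10, 9, 11, 10, 9), ncol = 2)

# One key per row; paste() is vectorised over the columns.
key <- do.call(paste, c(as.data.frame(Cm), sep = "\r"))

# Group row indices by key, keeping distinct rows in order of first appearance.
groups <- unname(split(seq_len(nrow(Cm)), factor(key, levels = unique(key))))
groups
```

The result is a list with one integer vector of row indices per distinct row, which is the same shape as the list built by the lapply() call.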
>
>
> On November 8, 2018 1:42:40 PM PST, li li <hannah.hlx using gmail.com> wrote:
> >Thanks to all for the replies. I will try to use plain text in the future.
> >One question regarding "which( ! duplicated( m, MARGIN=1 ) )":
> >this seems to return only the first row index for each distinct row,
> >not all of the row indices corresponding to each distinct row. For
> >example, in my example below, rows 1, 13 and 15 are all (1,9).
> >Thanks.
> >  Hanna
> >> A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> >> B <- rbind(A,A,A)
> >> C <- as.data.frame(B[sample(nrow(B)),])
> >> C
> >   V1 V2
> >1   1  9
> >2   2 10
> >3   3 11
> >4   5 13
> >5   7 15
> >6   6 14
> >7   4 12
> >8   3 11
> >9   8 16
> >10  5 13
> >11  7 15
> >12  2 10
> >13  1  9
> >14  8 16
> >15  1  9
> >16  3 11
> >17  7 15
> >18  4 12
> >19  2 10
> >20  6 14
> >21  4 12
> >22  8 16
> >23  5 13
> >24  6 14
> >> T <- unique(C)
> >> T
> >  V1 V2
> >1  1  9
> >2  2 10
> >3  3 11
> >4  5 13
> >5  7 15
> >6  6 14
> >7  4 12
> >9  8 16
> >>
> >> i <- 1
> >> which(C[,1]==T[i,1]& C[,2]==T[i,2])
> >[1]  1 13 15
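
The difference in behaviour is easier to see on a smaller case. The sketch below uses a made-up 4-row matrix (m, first_rows and has_copy are illustrative names): negating duplicated() marks only first occurrences, while combining it with its fromLast argument flags every row that has a copy somewhere.

```r
# Rows 1, 3 and 4 are all (1, 9); row 2 is unique.
m <- matrix(c(1,  2, 1, 1,
              9, 10, 9, 9), ncol = 2)

# First occurrence of each distinct row only.
first_rows <- which(!duplicated(m, MARGIN = 1))

# TRUE for every row that appears more than once, including the first copy.
has_copy <- duplicated(m, MARGIN = 1) |
  duplicated(m, MARGIN = 1, fromLast = TRUE)

first_rows
which(has_copy)
```

Neither call says which rows belong together; for that, the rows still have to be grouped by their content.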
> >
> >
> >On Thu, Nov 8, 2018 at 10:43 AM, Bert Gunter <bgunter.4567 using gmail.com> wrote:
> >
> >> Yes -- much better than mine. I didn't know about the MARGIN argument of
> >> duplicated().
> >>
> >> -- Bert
> >>
> >>
> >> On Wed, Nov 7, 2018 at 10:32 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:
> >>
> >>> Perhaps
> >>>
> >>> which( ! duplicated( m, MARGIN=1 ) )
> >>>
> >>> ? (untested)
> >>>
> >>> On November 7, 2018 9:20:57 PM PST, Bert Gunter <bgunter.4567 using gmail.com> wrote:
> >>> >A mess -- due to your continued use of html formatting.
> >>> >
> >>> >But something like this may do what you want (hard to tell with the
> >>> >mess):
> >>> >
> >>> >> m <- matrix(1:16,nrow=8)[rep(1:8,2),]
> >>> >> m
> >>> >      [,1] [,2]
> >>> > [1,]    1    9
> >>> > [2,]    2   10
> >>> > [3,]    3   11
> >>> > [4,]    4   12
> >>> > [5,]    5   13
> >>> > [6,]    6   14
> >>> > [7,]    7   15
> >>> > [8,]    8   16
> >>> > [9,]    1    9
> >>> >[10,]    2   10
> >>> >[11,]    3   11
> >>> >[12,]    4   12
> >>> >[13,]    5   13
> >>> >[14,]    6   14
> >>> >[15,]    7   15
> >>> >[16,]    8   16
> >>> >> vec <- apply(m,1,paste,collapse="-") ## converts rows into character vector
> >>> >> vec
> >>> > [1] "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14" "7-15" "8-16" "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14"
> >>> >[15] "7-15" "8-16"
> >>> >> ## Then maybe:
> >>> >> tapply(seq_along(vec),vec, I)
> >>> >$`1-9`
> >>> >[1] 1 9
> >>> >
> >>> >$`2-10`
> >>> >[1]  2 10
> >>> >
> >>> >$`3-11`
> >>> >[1]  3 11
> >>> >
> >>> >$`4-12`
> >>> >[1]  4 12
> >>> >
> >>> >$`5-13`
> >>> >[1]  5 13
> >>> >
> >>> >$`6-14`
> >>> >[1]  6 14
> >>> >
> >>> >$`7-15`
> >>> >[1]  7 15
> >>> >
> >>> >$`8-16`
> >>> >[1]  8 16
> >>> >
> >>> >> ## gives the row numbers for each unique row
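
A variation on the tapply() call above, offered as a sketch rather than a recommendation: match() against the unique keys gives one group number per row, which can be kept as a column or passed to split(). The names group_id and rows_by_group are illustrative only.

```r
# Same matrix as in the example above: rows 9..16 repeat rows 1..8.
m <- matrix(1:16, nrow = 8)[rep(1:8, 2), ]

# One character key per row.
vec <- apply(m, 1, paste, collapse = "-")

# Group number of each row, numbered in order of first appearance.
group_id <- match(vec, unique(vec))

# Row indices collected per group.
rows_by_group <- split(seq_along(vec), group_id)

group_id
rows_by_group[["3"]]
```

Numbering groups by first appearance keeps the output in the order of the data, whereas tapply() on the character keys orders the groups by the sorted key strings.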
> >>> >
> >>> >There may well be slicker ways to do this -- if this is actually what
> >>> >you want to do.
> >>> >
> >>> >-- Bert
> >>> >
> >>> >
> >>> >
> >>> >On Wed, Nov 7, 2018 at 7:56 PM li li <hannah.hlx using gmail.com> wrote:
> >>> >
> >>> >> Hi all,
> >>> >>    I use the following example to illustrate my question. As you can see,
> >>> >> in matrix C some rows are repeated and I would like to find the indices of
> >>> >> the rows corresponding to each of the distinct rows.
> >>> >>   For example, for the row c(1,9), I have used the "which" function to
> >>> >> identify the row indices corresponding to c(1,9). Using this approach, in
> >>> >> order to cover all distinct rows, I need to use a for loop.
> >>> >>    I am wondering whether there is an easier way where a for loop can be
> >>> >> avoided?
> >>> >>    Thanks very much!
> >>> >>       Hanna
> >>> >>
> >>> >>
> >>> >>
> >>> >> > A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> >>> >> > B <- rbind(A,A,A)
> >>> >> > C <- as.data.frame(B[sample(nrow(B)),])
> >>> >> > C
> >>> >>    V1 V2
> >>> >> 1   1  9
> >>> >> 2   2 10
> >>> >> 3   3 11
> >>> >> 4   5 13
> >>> >> 5   7 15
> >>> >> 6   6 14
> >>> >> 7   4 12
> >>> >> 8   3 11
> >>> >> 9   8 16
> >>> >> 10  5 13
> >>> >> 11  7 15
> >>> >> 12  2 10
> >>> >> 13  1  9
> >>> >> 14  8 16
> >>> >> 15  1  9
> >>> >> 16  3 11
> >>> >> 17  7 15
> >>> >> 18  4 12
> >>> >> 19  2 10
> >>> >> 20  6 14
> >>> >> 21  4 12
> >>> >> 22  8 16
> >>> >> 23  5 13
> >>> >> 24  6 14
> >>> >> > T <- unique(C)
> >>> >> > T
> >>> >>   V1 V2
> >>> >> 1  1  9
> >>> >> 2  2 10
> >>> >> 3  3 11
> >>> >> 4  5 13
> >>> >> 5  7 15
> >>> >> 6  6 14
> >>> >> 7  4 12
> >>> >> 9  8 16
> >>> >> >
> >>> >> > i <- 1
> >>> >> > which(C[,1]==T[i,1]& C[,2]==T[i,2])
> >>> >> [1]  1 13 15
> >>> >>
> >>> >>         [[alternative HTML version deleted]]
> >>> >>
> >>> >> ______________________________________________
> >>> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> >> PLEASE do read the posting guide
> >>> >> http://www.R-project.org/posting-guide.html
> >>> >> and provide commented, minimal, self-contained, reproducible code.
> >>> >>
> >>> >
> >>>
> >>> --
> >>> Sent from my phone. Please excuse my brevity.
> >>>
> >>
>
>
