[R] Identify row indices corresponding to each distinct row of a matrix
li li
hannah.hlx using gmail.com
Fri Nov 9 04:23:42 CET 2018
Thanks. It makes sense.
Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote on Thu, Nov 8, 2018 at 8:05 PM:
> The duplicated function returns TRUE for rows that have already
> appeared, so within each set of identical rows exactly one (the first) is
> not flagged in the output of duplicated. For the intended purpose of
> removing duplicates this behavior is ideal. I have no idea what your
> intended purpose is, since every row has duplicates elsewhere in the
> matrix. If you really want every set identified this way then a
> loop/apply seems inevitable (most opportunities for optimization come
> about by not visiting every combination).
>
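> Cm <- as.matrix( C )
> # row index of the first occurrence of each distinct row
> D <- which( !duplicated( Cm, MARGIN=1 ) )
> nCm <- nrow( Cm )
> # for each distinct row, find every row of Cm that matches it in all columns
> F <- lapply( D, function(d) {
>   idxrep <- rep( d, nCm )
>   which( 0 == unname( rowSums( Cm[idxrep,] != Cm ) ) )
> } )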
>
>
> On November 8, 2018 1:42:40 PM PST, li li <hannah.hlx using gmail.com> wrote:
> >Thanks to all for the replies. I will try to use plain text in the future.
> >One question regarding using "which( ! duplicated( m, MARGIN=1 ) )".
> >This seems to return the first row index for each distinct row, but it
> >does not give all the row indices corresponding to each of the distinct
> >rows. For example, in my example below, rows 1, 13, and 15 are all (1,9).
> >Thanks.
> > Hanna
> >> A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> >> B <- rbind(A,A,A)
> >> C <- as.data.frame(B[sample(nrow(B)),])
> >> C
> > V1 V2
> >1 1 9
> >2 2 10
> >3 3 11
> >4 5 13
> >5 7 15
> >6 6 14
> >7 4 12
> >8 3 11
> >9 8 16
> >10 5 13
> >11 7 15
> >12 2 10
> >13 1 9
> >14 8 16
> >15 1 9
> >16 3 11
> >17 7 15
> >18 4 12
> >19 2 10
> >20 6 14
> >21 4 12
> >22 8 16
> >23 5 13
> >24 6 14
> >> T <- unique(C)
> >> T
> > V1 V2
> >1 1 9
> >2 2 10
> >3 3 11
> >4 5 13
> >5 7 15
> >6 6 14
> >7 4 12
> >9 8 16
> >>
> >> i <- 1
> >> which(C[,1]==T[i,1]& C[,2]==T[i,2])
> >[1] 1 13 15
> >
> >
> >Bert Gunter <bgunter.4567 using gmail.com> wrote on Thu, Nov 8, 2018 at 10:43 AM:
> >
> >> Yes -- much better than mine. I didn't know about the MARGIN argument of
> >> duplicated().
> >>
> >> -- Bert
> >>
> >>
> >> On Wed, Nov 7, 2018 at 10:32 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:
> >>
> >>> Perhaps
> >>>
> >>> which( ! duplicated( m, MARGIN=1 ) )
> >>>
> >>> ? (untested)
> >>>
> >>> On November 7, 2018 9:20:57 PM PST, Bert Gunter <bgunter.4567 using gmail.com> wrote:
> >>> >A mess -- due to your continued use of html formatting.
> >>> >
> >>> >But something like this may do what you want (hard to tell with the
> >>> >mess):
> >>> >
> >>> >> m <- matrix(1:16,nrow=8)[rep(1:8,2),]
> >>> >> m
> >>> > [,1] [,2]
> >>> > [1,] 1 9
> >>> > [2,] 2 10
> >>> > [3,] 3 11
> >>> > [4,] 4 12
> >>> > [5,] 5 13
> >>> > [6,] 6 14
> >>> > [7,] 7 15
> >>> > [8,] 8 16
> >>> > [9,] 1 9
> >>> >[10,] 2 10
> >>> >[11,] 3 11
> >>> >[12,] 4 12
> >>> >[13,] 5 13
> >>> >[14,] 6 14
> >>> >[15,] 7 15
> >>> >[16,] 8 16
> >>> >> vec <- apply(m,1,paste,collapse="-") ## converts rows into character vector
> >>> >> vec
> >>> > [1] "1-9" "2-10" "3-11" "4-12" "5-13" "6-14" "7-15" "8-16" "1-9" "2-10" "3-11" "4-12" "5-13" "6-14"
> >>> >[15] "7-15" "8-16"
> >>> >> ## Then maybe:
> >>> >> tapply(seq_along(vec),vec, I)
> >>> >$`1-9`
> >>> >[1] 1 9
> >>> >
> >>> >$`2-10`
> >>> >[1] 2 10
> >>> >
> >>> >$`3-11`
> >>> >[1] 3 11
> >>> >
> >>> >$`4-12`
> >>> >[1] 4 12
> >>> >
> >>> >$`5-13`
> >>> >[1] 5 13
> >>> >
> >>> >$`6-14`
> >>> >[1] 6 14
> >>> >
> >>> >$`7-15`
> >>> >[1] 7 15
> >>> >
> >>> >$`8-16`
> >>> >[1] 8 16
> >>> >
> >>> >> ## gives the row numbers for each unique row
> >>> >
> >>> >There may well be slicker ways to do this -- if this is actually what you
> >>> >want to do.
> >>> >
> >>> >-- Bert
> >>> >
> >>> >
> >>> >
> >>> >On Wed, Nov 7, 2018 at 7:56 PM li li <hannah.hlx using gmail.com> wrote:
> >>> >
> >>> >> Hi all,
> >>> >> I use the following example to illustrate my question. As you can see,
> >>> >> in matrix C some rows are repeated and I would like to find the indices of
> >>> >> the rows corresponding to each of the distinct rows.
> >>> >> For example, for the row c(1,9), I have used the "which" function to
> >>> >> identify the row indices corresponding to c(1,9). Using this approach, in
> >>> >> order to cover all distinct rows, I need to use a for loop.
> >>> >> I am wondering whether there is an easier way where a for loop can be
> >>> >> avoided?
> >>> >> Thanks very much!
> >>> >> Hanna
> >>> >>
> >>> >>
> >>> >>
> >>> >> > A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> >>> >> > B <- rbind(A,A,A)
> >>> >> > C <- as.data.frame(B[sample(nrow(B)),])
> >>> >> > C
> >>> >>   V1 V2
> >>> >> 1 1 9
> >>> >> 2 2 10
> >>> >> 3 3 11
> >>> >> 4 5 13
> >>> >> 5 7 15
> >>> >> 6 6 14
> >>> >> 7 4 12
> >>> >> 8 3 11
> >>> >> 9 8 16
> >>> >> 10 5 13
> >>> >> 11 7 15
> >>> >> 12 2 10
> >>> >> 13 1 9
> >>> >> 14 8 16
> >>> >> 15 1 9
> >>> >> 16 3 11
> >>> >> 17 7 15
> >>> >> 18 4 12
> >>> >> 19 2 10
> >>> >> 20 6 14
> >>> >> 21 4 12
> >>> >> 22 8 16
> >>> >> 23 5 13
> >>> >> 24 6 14
> >>> >> > T <- unique(C)
> >>> >> > T
> >>> >>   V1 V2
> >>> >> 1 1 9
> >>> >> 2 2 10
> >>> >> 3 3 11
> >>> >> 4 5 13
> >>> >> 5 7 15
> >>> >> 6 6 14
> >>> >> 7 4 12
> >>> >> 9 8 16
> >>> >> >
> >>> >> > i <- 1
> >>> >> > which(C[,1]==T[i,1]& C[,2]==T[i,2])
> >>> >> [1] 1 13 15
> >>> >>
> >>> >> [[alternative HTML version deleted]]
> >>> >>
> >>> >> ______________________________________________
> >>> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> >> PLEASE do read the posting guide
> >>> >> http://www.R-project.org/posting-guide.html
> >>> >> and provide commented, minimal, self-contained, reproducible code.
> >>> >>
> >>> >
> >>>
> >>> --
> >>> Sent from my phone. Please excuse my brevity.
> >>>
> >>
>
> --
> Sent from my phone. Please excuse my brevity.
>
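
For the question discussed in this thread (all row indices for each distinct row, not just the first), one possible alternative is to build a text key per row and group the row numbers with split(). This is only a sketch under assumptions: the names key and groups are made up for illustration, the rows are left unshuffled so the result is reproducible, and the "_" separator is assumed not to occur in the data.

```r
# Small fixed example (no sample()), so the row positions are predictable.
A <- matrix(1:16, nrow = 8, ncol = 2)
C <- as.data.frame(rbind(A, A, A))

# One character key per row, e.g. "1_9"; assumes "_" never appears in the data.
key <- do.call(paste, c(C, sep = "_"))

# Group the row numbers by key. Fixing the factor levels to unique(key)
# keeps the groups in order of first appearance, matching unique(C).
groups <- split(seq_len(nrow(C)), factor(key, levels = unique(key)))

groups[[1]]  # row indices of the first distinct row, c(1, 9)
```

Each element of groups holds the row numbers of one distinct row, so groups[[i]] plays the role of the which(...) call for T[i,] in the original post. Like the paste/tapply suggestion quoted above, it compares rows through their text form, which may not be appropriate for non-integer numeric data.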