[R] Test for column equality across matrices

William Dunlap wdunlap at tibco.com
Sun Jul 14 20:22:23 CEST 2013


It looks like match() (and relatives like %in% and is.element) act a bit unpredictably
on lists when the list elements are vectors of numbers of different types.  If you match
integers to integers or doubles to doubles it works as expected, but when the types
don't match the results vary.  I would expect the following to give either c(1,2) or
c(NA,NA) but not c(1,NA):

> match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13.,15.,16.), c(14.,15.,16.) ))
[1]  1 NA

It works when the list elements have the same type

> match( list( c(13L,15L,16L), c(14L,15L,16L)), list( c(13L,15L,16L), c(14L,15L,16L) ))
[1] 1 2
> match( list( c(13.,15.,16.), c(14.,15.,16.)), list( c(13.,15.,16.), c(14.,15.,16.) ))
[1] 1 2
> match( list( c(13.,15.,16.), c(14L,15L,16L)), list( c(13.,15.,16.), c(14L,15L,16L) ))
[1] 1 2

So A and B should be coerced to a common type ('storage.mode') before
they are compared.
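
As a rough sketch of that coercion, using the A and B from the original
question and the columnsOf() helper from my earlier reply (the names keep
and newB are just illustrative, and drop = FALSE is my addition so the
result stays a matrix):

```r
# Helper from earlier in the thread: split a matrix into a list of its columns.
columnsOf <- function(mat) split(mat, col(mat))

A <- matrix(t(expand.grid(c(1, 2, 3, 4, 5), 15, 16)), nrow = 3)
B <- combn(16, 3)

# Force both matrices to the same storage mode before comparing columns as lists.
storage.mode(A) <- "double"
storage.mode(B) <- "double"

# TRUE for the columns of B that do not appear among the columns of A.
keep <- !is.element(columnsOf(B), columnsOf(A))
newB <- B[, keep, drop = FALSE]
```

Coercing to "double" rather than "integer" is the safer direction here,
since it cannot truncate non-integer values.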

By the way, the discrepancy might happen because match() applied to lists might
be implemented by calling deparse on each element of each list and then using
the character method of match.  For sequential integers deparse uses colon notation;
e.g., c(14L,15L,16L) becomes the string "14:16".  But usually deparse puts an 'L' after
integers so they would never match with a double of the same value.
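
To see the string forms in question, one can deparse the vectors directly.
This only shows what deparse() produces; whether match() really goes
through deparse() (or through something like as.character() on the list,
which is my guess at another place to look) is an assumption I have not
checked against the source:

```r
# Sequential integers are deparsed using colon notation.
seqInt <- deparse(c(14L, 15L, 16L))
# Non-sequential integers keep the 'L' suffix on each element.
nonSeqInt <- deparse(c(13L, 15L, 16L))
# The corresponding doubles, for comparison.
seqDbl <- deparse(c(14, 15, 16))
nonSeqDbl <- deparse(c(13, 15, 16))

print(c(seqInt, nonSeqInt, seqDbl, nonSeqDbl))

# Another string conversion to inspect (assumption: possibly related to
# how lists are compared; not verified).
print(as.character(list(c(13L, 15L, 16L), c(14L, 15L, 16L))))
print(as.character(list(c(13, 15, 16), c(14, 15, 16))))
```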

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: arun [mailto:smartpink111 at yahoo.com]
> Sent: Saturday, July 13, 2013 10:57 AM
> To: William Dunlap
> Cc: R help; Thiem Alrik
> Subject: Re: [R] Test for column equality across matrices
> 
> I tried it on a slightly bigger dataset:
> A1 <- matrix(t(expand.grid(1:90, 15, 16)), nrow = 3)
> B1 <- combn(90, 3)
> which(is.element(columnsOf(B1), columnsOf(A1)))
> # [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
> #[13] 41481
> 
> 
> which(apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse=""))
> # [1]  1067  4895  8636 12291 15861 19347 22750 26071 29311 32471 35552 38555
> #[13] 41481 44331
> 
> 
> B1[,44331]
> #[1] 14 15 16
> 
> 
> which(apply(t(A1),1,paste,collapse="")=="141516")
> #[1] 14
> 
> B1New<-B1[,!apply(t(B1),1,paste,collapse="")%in%apply(t(A1),1,paste,collapse="")]
> newB <- B1[ , !is.element(columnsOf(B1), columnsOf(A1))]
>  identical(B1New,newB)
> #[1] FALSE
> 
>  is.element(B1[,44331],A1[,14])
> #[1] TRUE TRUE TRUE
> 
> 
>  B1Sp<-columnsOf(B1)
> B1Sp[[44331]]
> #[1] 14 15 16
>  A1Sp<- columnsOf(A1)
>  A1Sp[[14]]
> #[1] 14 15 16
>  is.element(B1Sp[[44331]],A1Sp[[14]])
> #[1] TRUE TRUE TRUE
> 
> 
> A.K.
> 
> 
> 
> ----- Original Message -----
> From: William Dunlap <wdunlap at tibco.com>
> To: Thiem Alrik <thiem at sipo.gess.ethz.ch>; "mailman, r-help" <r-help at r-project.org>
> Cc:
> Sent: Saturday, July 13, 2013 1:30 PM
> Subject: Re: [R] Test for column equality across matrices
> 
> Try
>    columnsOf <- function(mat) split(mat, col(mat))
>    newB <- B[ , !is.element(columnsOf(B), columnsOf(A))]
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> > Of Thiem Alrik
> > Sent: Saturday, July 13, 2013 6:45 AM
> > To: mailman, r-help
> > Subject: [R] Test for column equality across matrices
> >
> > Dear list,
> >
> > I have two matrices
> >
> > A <- matrix(t(expand.grid(c(1,2,3,4,5), 15, 16)), nrow = 3)
> > B <- combn(16, 3)
> >
> > Now I would like to exclude all columns from the 560 columns in B which are identical to
> > any 1 of the 6 columns in A. How could I do this?
> >
> > Many thanks and best wishes,
> >
> > Alrik
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 


