[R] Union of columns of two matrices

Ling, Gary (Electronic Trading) Gary_Ling at ml.com
Thu Aug 7 01:20:29 CEST 2008


Here is my attempt. I'm not sure it's the most efficient way to do it,
because I'm "cheating" by leaning on one of R's nice built-in features,
namely duplicated().

I assume the matrices have the same number of rows.
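
If you want that assumption checked rather than trusted, a small guard
along these lines might do. This is only a sketch: same_rows() is a
hypothetical helper name, not something from base R. The check matters
because sapply() quietly returns a list, not a matrix, when the column
vectors differ in length.

```r
# Hypothetical guard (illustrative only): do both inputs have the same
# number of rows?
same_rows <- function(A, B) {
  stopifnot(is.matrix(A), is.matrix(B))
  nrow(A) == nrow(B)
}

same_rows(matrix(0, 4, 2), matrix(0, 4, 3))   # equal row counts
same_rows(matrix(0, 4, 2), matrix(0, 3, 3))   # unequal row counts

# Why it matters: with unequal lengths sapply() cannot simplify
is.list(sapply(list(1:4, 1:3), function(x) x))
```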

### example ###
### background setup: simulate 2 matrices with some common columns
(A <- cbind(1:4, matrix(rnorm(16),4), 101:104))
#      [,1]       [,2]       [,3]       [,4]      [,5] [,6]
# [1,]    1 -0.5305169 -1.7243920 -0.1722617 1.7343167  101
# [2,]    2 -0.3466017  0.3737072  0.5961296 1.4493053  102
# [3,]    3 -1.7812876 -1.5707614  1.4401485 0.9683144  103
# [4,]    4 -1.7219545  0.4762025 -0.2137656 0.7008253  104
(B <-
cbind(matrix(rnorm(8),4),1:4,matrix(rnorm(12),4),101:104,c(1:3,5)))
#            [,1]       [,2] [,3]       [,4]       [,5]       [,6] [,7] [,8]
# [1,]  1.1182879  0.5340995    1  1.0434300 -0.5105291 -1.0994476  101    1
# [2,]  0.4031942  0.3156704    2 -0.4704723  0.8367561 -1.6163610  102    2
# [3,] -1.0317547 -0.5642614    3  1.0916636  1.0411857  0.1914676  103    3
# [4,] -0.6036328  3.2339688    4  1.8505135  2.0055947 -0.0359060  104    5

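### some auxiliary abstractions
# split a matrix into a list with one vector per column
matrix2list <- function(M) split(M, col(M))
# bind a list of equal-length vectors back into a matrix, one column each
list2matrix <- function(L) sapply(L, function(x) x)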

### Then the problem can be solved in 2 lines 
L <- c(matrix2list(A),matrix2list(B))
list2matrix(L[!duplicated(L)])

# Or even 1 line, but kind of confusing
(function(L)
list2matrix(L[!duplicated(L)]))(c(matrix2list(A),matrix2list(B)))

# output; compare to above, the duplicated columns are gone
#      1          2          3          4         5   6          1          2
# [1,] 1 -0.5305169 -1.7243920 -0.1722617 1.7343167 101  1.1182879  0.5340995
# [2,] 2 -0.3466017  0.3737072  0.5961296 1.4493053 102  0.4031942  0.3156704
# [3,] 3 -1.7812876 -1.5707614  1.4401485 0.9683144 103 -1.0317547 -0.5642614
# [4,] 4 -1.7219545  0.4762025 -0.2137656 0.7008253 104 -0.6036328  3.2339688
#               4          5          6 8
# [1,]  1.0434300 -0.5105291 -1.0994476 1
# [2,] -0.4704723  0.8367561 -1.6163610 2
# [3,]  1.0916636  1.0411857  0.1914676 3
# [4,]  1.8505135  2.0055947 -0.0359060 5

##### end example #####
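
For comparison, base R's unique() has a matrix method with a MARGIN
argument, so the same union can probably be had without the list round
trip. The sketch below is a different technique from the one in the
example above; the seed is arbitrary and only there to make the
simulated data repeatable. It rebuilds A and B the same way and checks
that both approaches keep the same columns.

```r
# Simulated data built like the example above (seed chosen arbitrarily)
set.seed(1)
A <- cbind(1:4, matrix(rnorm(16), 4), 101:104)
B <- cbind(matrix(rnorm(8), 4), 1:4, matrix(rnorm(12), 4), 101:104, c(1:3, 5))

# Base R: MARGIN = 2 makes whole columns the units to de-duplicate
U <- unique(cbind(A, B), MARGIN = 2)

# The list-based approach, for comparison
L <- c(split(A, col(A)), split(B, col(B)))
V <- sapply(L[!duplicated(L)], function(x) x)

dim(U)                     # rows, and columns left after de-duplication
all.equal(U, unname(V))    # unname() drops the column names sapply() adds
```

Both versions keep the first occurrence of each column, so the column
order should match as well.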

I'm not sure how duplicated() is implemented in R. If the columns were
sorted before comparing, I'd guess the cost is O(n log n); with a naive
pairwise comparison it's O(n^2). [n = length(L), i.e. the total number
of columns]
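
One way to get a feel for the scaling without reading the source is to
time duplicated() on lists of growing size. This is a rough sketch, not
a proper benchmark: make_cols() is a hypothetical helper, the sizes are
arbitrary, and timings will vary by machine. If the elapsed time roughly
doubles each time n doubles, the cost is close to linear; if it roughly
quadruples, it is closer to quadratic.

```r
# Hypothetical helper: a list of n random columns, 4 rows each
make_cols <- function(n) {
  split(matrix(rnorm(4 * n), 4), rep(seq_len(n), each = 4))
}

# Double n each round and watch how the elapsed time grows
for (n in c(1000, 2000, 4000, 8000)) {
  L <- make_cols(n)
  elapsed <- system.time(d <- duplicated(L))[["elapsed"]]
  cat(sprintf("n = %5d  elapsed = %.3f s  duplicates = %d\n",
              n, elapsed, sum(d)))
}
```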

-gary




-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Giuseppe Paleologo
Sent: Wednesday, August 06, 2008 6:33 PM
To: r-help at r-project.org
Subject: [R] Union of columns of two matrices


I was posed the following problem/teaser:

given two matrices, come up with an "elegant" (=fast & short) function
that returns a matrix with all and only the non-duplicated columns of
both matrices; the column order does not matter. In essence, a matrix
equivalent of union(x,y), where x and y are vectors. I could not come up
with anything nice. Any ideas?

Giuseppe

