[R] populating matrix with binary variable after matching data from data frame

Wed Aug 13 23:33:24 CEST 2014

Hello again, and sorry for asking again.

Maybe I was not clear when I asked before.

I don't want to remove rows from the matrix, since the row names and
column names of the matrix are identical.

I tried your suggestion and here is what I get:

> fx <- function(x,x1){
+ i <- as.matrix(x1[,c("V1","V2")])
+ x[i]<-1
+ x
+ }
> fx(x, x1)

Error in `[<-`(`*tmp*`, i, value = 1) : subscript out of bounds

> x[1:4,1:4]
       ABCA10 ABCA12 ABCA13 ABCA4
ABCA10      0      0      0     0
ABCA12      0      0      0     0
ABCA13      0      0      0     0
ABCA4       0      0      0     0

> x1[1:10,]
      V1       V2
1   AKT3    TCL1A
2  AKTIP    VPS41
3  AKTIP    PDPK1
4  AKTIP   GTF3C1
5  AKTIP    HOOK2
6  AKTIP    POLA2
7  AKTIP KIAA1377
8  AKTIP FAM160A2
9  AKTIP    VPS16
10 AKTIP    VPS18
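
The error above is consistent with names in x1 that do not occur in the dimnames of x. A minimal check, assuming only the 4 x 4 subset of x and the ten rows of x1 posted here (the full objects may differ), could look like this:

```r
# Data copied from the post: a 4 x 4 zero matrix and the first ten rows of x1.
genes <- c("ABCA10", "ABCA12", "ABCA13", "ABCA4")
x <- matrix(0, 4, 4, dimnames = list(genes, genes))
x1 <- data.frame(
  V1 = c("AKT3", rep("AKTIP", 9)),
  V2 = c("TCL1A", "VPS41", "PDPK1", "GTF3C1", "HOOK2", "POLA2",
         "KIAA1377", "FAM160A2", "VPS16", "VPS18"),
  stringsAsFactors = FALSE
)

# TRUE where the name is a valid row label (V1) or column label (V2) of x.
in_rows <- x1$V1 %in% rownames(x)
in_cols <- x1$V2 %in% colnames(x)

# Number of pairs that could be used as a subscript into x; 0 for this subset.
sum(in_rows & in_cols)
```

With no usable pairs, every row of the subscript matrix points outside x, which would explain "subscript out of bounds".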

For instance, I now loop over x1. For the first row, I take V1 and
check whether x has a row with that name, then check whether V2 exists
in the column names. If both match, I assign 1. If not, I go on to
row 2.

In some rows it is possible that only the element in V2 exists in the
row names. Since the element in V1 does not exist in matrix x, the
entry stays 0. (Since matrix x has identical row and column names, I
feel it does not matter whether we check an element against the column
names once we have checked it against the row names.)

For instance, if I see ABCA10 in x1$V1 and ABCA10 in x1$V2 in the same
row of x1, then row 1, column 1 of matrix x should get 1.
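
The behaviour described above (assign 1 only when both names are found, otherwise leave 0) can be sketched without a loop. This is only an illustration: fill_pairs is a hypothetical helper name, and the rows of x1 below are made up so that some pairs match.

```r
genes <- c("ABCA10", "ABCA12", "ABCA13", "ABCA4")
x <- matrix(0, 4, 4, dimnames = list(genes, genes))

# Hypothetical pairs for illustration: the first two are usable, the last two are not.
x1 <- data.frame(
  V1 = c("ABCA10", "ABCA12", "AKTIP", "ABCA13"),
  V2 = c("ABCA10", "ABCA4",  "VPS41", "TCL1A"),
  stringsAsFactors = FALSE
)

# fill_pairs is a hypothetical helper, not a function from the thread.
fill_pairs <- function(x, x1) {
  i <- as.matrix(x1[, c("V1", "V2")])            # 2-column character subscript
  keep <- i[, 1] %in% rownames(x) &              # V1 must be a row name
          i[, 2] %in% colnames(x)                # V2 must be a column name
  x[i[keep, , drop = FALSE]] <- 1                # pairs that fail the check are skipped
  x
}

res <- fill_pairs(x, x1)
res
```

No rows or columns are removed from x; only the pairs that cannot be located are skipped.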

dput output follows.

x <- structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(4L,
4L), .Dimnames = list(c("ABCA10", "ABCA12", "ABCA13", "ABCA4"
), c("ABCA10", "ABCA12", "ABCA13", "ABCA4")))

x1 <- structure(list(V1 = c("AKT3", "AKTIP", "AKTIP", "AKTIP", "AKTIP",
"AKTIP", "AKTIP", "AKTIP", "AKTIP", "AKTIP"), V2 = c("TCL1A",
"VPS41", "PDPK1", "GTF3C1", "HOOK2", "POLA2", "KIAA1377", "FAM160A2",
"VPS16", "VPS18")), .Names = c("V1", "V2"), row.names = c(NA,
10L), class = "data.frame")

Thanks for your time.

On Wed, Aug 13, 2014 at 12:51 PM, William Dunlap <wdunlap at tibco.com> wrote:
> You can replace the loop
>> for (i in seq_len(nrow(x1))) {
>>    x[x1$V1[i], x1$V2[i]] <- 1;
>> }
> by
> f <- function(x, x1) {
>   i <- as.matrix(x1[, c("V1","V2")]) # 2-column matrix to use as a subscript
>   x[ i ] <- 1
>   x
> }
> f(x, x1)
>
> You will get an error if not all the strings in the subscript matrix
> are in the row or column names of x.  What do you want to happen in
> this case?  You can choose to first omit the bad rows in the
> subscript matrix
>     goodRows <- is.element(i[,1], dimnames(x)[[1]]) &
>         is.element(i[,2], dimnames(x)[[2]])
>     i <- i[goodRows, , drop=FALSE]
>     x[ i ] <- 1
> or you can choose to expand x to include all the names found in x1.
>
> It would be good if you included some toy data to better illustrate
> what you want to do.
> E.g., with
>   x <- array(0, c(3,3), list(Row=paste0("R",1:3),Col=paste0("C",1:3)))
>   x1 <- data.frame(V1=c("R1","R3"), V2=c("C2","C1"))
> the above f() gives
>> f(x, x1)
>     Col
> Row  C1 C2 C3
>   R1  0  1  0
>   R2  0  0  0
>   R3  1  0  0
> Is that what you are looking for?
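
The other option mentioned in the quoted reply, expanding x to include all the names found in x1, might be sketched as follows. The helper name expand_and_fill and the rows of x1 are assumptions made for illustration, built on the toy x from the reply.

```r
x <- array(0, c(3, 3), list(Row = paste0("R", 1:3), Col = paste0("C", 1:3)))

# Hypothetical x1: "R4" and "C5" do not occur in the dimnames of x.
x1 <- data.frame(V1 = c("R1", "R4"), V2 = c("C2", "C5"),
                 stringsAsFactors = FALSE)

# expand_and_fill is a hypothetical helper, not a function from the thread.
expand_and_fill <- function(x, x1) {
  rows <- union(rownames(x), x1$V1)              # old row names first, then new ones
  cols <- union(colnames(x), x1$V2)              # old column names first, then new ones
  big <- matrix(0, length(rows), length(cols), dimnames = list(rows, cols))
  big[rownames(x), colnames(x)] <- x             # carry over the existing entries
  big[as.matrix(x1[, c("V1", "V2")])] <- 1       # every pair is now in bounds
  big
}

res <- expand_and_fill(x, x1)
res
```

Note that this sketch drops the "Row" and "Col" labels on the dimnames, and the result is no longer square with identical row and column names unless the new names are added to both dimensions.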