[R] conditional replacement of elements of matrix with another matrix column

Avi Gross <avigross using verizon.net>
Thu Sep 2 04:15:38 CEST 2021


Just for the hell of it, I looked at the huge amount of data to see the
lengths:

 

> nrow(A)
[1] 8760
> nrow(B)
[1] 734
> sum(is.na(A[, 2]))
[1] 8760
> sum(is.na(B[, 2]))
[1] 0

 

So it seems your first huge matrix has 8,760 rows where the second entry is
always NA.

 

B seems to have 733 unique values out of 734 entries in column one, which I
will call the key, and 192 different values in column two that the keys map
into.

 

> length(unique(B[,1]))
[1] 733
> length(unique(B[,2]))
[1] 192

 

I now conclude the question was badly phrased, as often happens when English
is not the main language used, or the person asking may have provided an
incomplete request, perhaps based on their misunderstanding.

 

First, matrix A has NOTHING anywhere in the second column other than an NA
placeholder. Its first column has umpteen copies of the same number followed
by umpteen of the next and so on: specifically, exactly 24 copies of each!

 

> table(A[,1])

17897 17898 17899 17900 17901 17902 17903 17904 17905 17906 17907 17908
   24    24    24    24    24    24    24    24    24    24    24    24
17909 17910 17911 17912
   24    24    24    24

<<SNIP>>

18249 18250 18251 18252 18253 18254 18255 18256 18257 18258 18259 18260
   24    24    24    24    24    24    24    24    24    24    24    24
18261
   24
 

I have no interest in why any of that is so, but the problem now strikes me
as different. It is not about what to do when A and B have the same value in
column one at all, especially as they are not at all similar. It is about
table lookup, I think.

 

As such, the request is to replace each NA in table A (probably no need to
make a C, albeit that works too) with the column-two value from whichever
row of B has the same column-one value as that row of A.

 

Such a request can be handled quite a few ways BEFORE or after. I mean
instead of making 24 copies in A, you could just make 24 copies of B, and if
needed sort them. But more generally, there are many R functions that do all
kinds of joins, such as merge() in base R or the join functions in the
dplyr/tidyverse packages. Some of these work on data.frames rather than
matrices, but matrices can easily be converted.

 

And of course many alternatives, some painful, involve iterating over one
matrix while searching the other for a match, or setting up B as a
searchable object that simulates a hash or dictionary in other languages,
such as a named structure.

 

For example, make a named vector containing column two with the names of
column 1:
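
A minimal sketch of that step, assuming B is a two-column numeric matrix as
in the dput(B) output. The six rows used here are copied from the head(B)
output in this message, and the use of setNames() is an assumption about how
B_vec could be built, not necessarily the original code:

```r
# First six rows of B, copied from the head(B) output in this message
B <- cbind(c(13634, 13635, 13637, 13638, 13639, 13640),
           c(3, 32, 88, 126, 8, 2))

# Named vector: values come from column two, names from column one.
# The numeric keys are converted to character when used as names.
B_vec <- setNames(B[, 2], B[, 1])

# Look up a key by its character representation
B_vec[as.character(13635)]
```

If a key occurs more than once in B, indexing by name returns the first
matching entry only.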

 

You can now look up items in B_vec using the character representation:

 

Here are the first few rows of B:

 

> head(B)
      [,1] [,2]
[1,] 13634    3
[2,] 13635   32
[3,] 13637   88
[4,] 13638  126
[5,] 13639    8
[6,] 13640    2

 

Searching for 13635 works fine:

 

> B_vec[as.character(13635)]
13635 
   32 
> B_vec[as.character(13636)]
<NA> 
  NA 
> B_vec[as.character(13637)]
13637 
   88 

 

But since 13636 is not in the vector, that lookup fails and returns NA.

 

So converting A (or a copy called C) becomes fairly simple IFF the sets of
numbers in A and B are properly set up.

 

A[,2] <- B_vec[as.character(A[,1])]
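
To see the whole lookup end to end, here is a small sketch on hypothetical
toy matrices (these are made-up values with unique keys in B, not the
poster's data). It also shows match() as an alternative that avoids the
character conversion; that alternative is my suggestion, not something
taken from the thread:

```r
# Hypothetical toy data: A has repeated keys and an NA placeholder column,
# B maps each key to a value (key 12 is deliberately absent from B)
A <- cbind(c(12, 12, 13, 13, 14, 14), NA)
B <- cbind(c(11, 13, 14), c(6, 9, 12))

# Lookup through a named vector, as described above
B_vec <- setNames(B[, 2], B[, 1])
C <- A
C[, 2] <- B_vec[as.character(A[, 1])]

# Same lookup with match(): for each key in A, find its row in B
C2 <- A
C2[, 2] <- B[match(A[, 1], B[, 1]), 2]
```

Keys of A that do not occur in B stay NA in both versions, and when B
contains a duplicated key both versions use its first occurrence.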

 

But are they?

 

> range(A[,1])
[1] 17897 18261
> range(B[,1])
[1] 13634 18148
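
The two ranges overlap only in part: A runs up to 18261 while B stops at
18148, so any key in A above 18148 cannot be found in B. A rough way to
count how many keys have a match is sketched below on hypothetical toy
vectors (A_keys and B_keys stand in for A[,1] and B[,1]; they are not the
poster's data):

```r
# Hypothetical stand-ins for A[, 1] and B[, 1]
A_keys <- c(12, 12, 13, 13, 14, 14)
B_keys <- c(11, 13, 14)

# TRUE for each element of A_keys that occurs somewhere in B_keys
in_B <- A_keys %in% B_keys

# Number of rows of A that would receive a value from B
n_matched <- sum(in_B)

# Distinct keys of A that have no counterpart in B
missing_keys <- unique(A_keys[!in_B])
```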

 

But I think I have wasted enough of my time and of everyone who read this
far on a problem that was not explained and may well still not be what I am
guessing. As noted, probably easiest to solve using a merge.
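
A sketch of the merge() route, assuming the matrices are first converted to
data.frames. The toy values and the column names key and value are
hypothetical, chosen only for illustration:

```r
# Hypothetical toy data, not the poster's matrices
A <- cbind(c(12, 12, 13, 13, 14, 14), NA)
B <- cbind(c(11, 13, 14), c(6, 9, 12))

# Keep only the key column of A; its second column holds nothing but NA
A_df <- data.frame(key = A[, 1])
B_df <- data.frame(key = B[, 1], value = B[, 2])

# Left join: keep every row of A, fill in value where the key exists in B
C_df <- merge(A_df, B_df, by = "key", all.x = TRUE)

# Back to a matrix if that is what is wanted
C <- as.matrix(C_df)
```

Note that merge() sorts the result by the key by default, so the row order
of A is not necessarily preserved, and a key duplicated in B produces
duplicated rows in the result.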

 

 

 

 

From: Eliza Botto <eliza_botto using outlook.com> 
Sent: Wednesday, September 1, 2021 6:00 PM
To: r-help using r-project.org; Mohammad Tanvir Ahamed <mashranga using yahoo.com>; Avi
Gross <avigross using verizon.net>; Richard M. Heiberger <rmh using temple.edu>
Subject: Re: [R] conditional replacement of elements of matrix with another
matrix column

 

I thank you all. But the code doesn't work on my different dataset where A
and B have different column lengths. For example,

 

> dput(A) 

structure(c(17897, 17897, 17897, 17897, 17897, 17897, 17897, 

17897, 17897, 17897, 17897, 17897, 17897, 17897, 17897, 17897, 

<<SNIP>>

NA), .Dim = c(8760L, 2L))

 

 

> dput(B) 

structure(c(13634, 13635, 13637, 13638, 13639, 13640, 13641, 

13642, 13643, 13645, 13646, 13647, 13648, 13649, 13650, 13651, 

<<SNIP>>

214, 156, 240, 29, 2, 374, 36, 4, 18, 419, 2, 5, 3, 277, 340, 

1, 216, 93, 1, 4, 2, 3, 42, 78, 190, 40, 808, 80, 266, 66, 42

), .Dim = c(734L, 2L))

 

Can you please guide me how to implement the given code on this dataset?

I thank you in advance

  _____  

From: Mohammad Tanvir Ahamed <mashranga using yahoo.com>
Sent: Wednesday 1 September 2021 21:48
To: r-help using r-project.org <r-help using r-project.org>; Eliza Botto
<eliza_botto using outlook.com>
Subject: Re: [R] conditional replacement of elements of matrix with another
matrix column 

 

C1 <- A
C1[,2][which(B[,1]%in%A[,1])] <- B[,2][which(B[,1]%in%A[,1])]


Regards.............
Tanvir Ahamed 






On Wednesday, 1 September 2021, 11:00:16 pm GMT+2, Eliza Botto
<eliza_botto using outlook.com> wrote: 





deaR useRs,

I have the matrix "A" and matrix "B" and I want the matrix "C". Is there a
way of doing it?

> dput(A)

structure(c(12, 12, 12, 13, 13, 13, 14, 14, 14, NA, NA, NA, NA,
NA, NA, NA, NA, NA), .Dim = c(9L, 2L))

> dput(B)

structure(c(11, 11, 11, 13, 13, 13, 14, 14, 14, 6, 7, 8, 9, 10,
11, 12, 13, 14), .Dim = c(9L, 2L))

> dput(C)

structure(c(12, 12, 12, 13, 13, 13, 14, 14, 14, NA, NA, NA, 9,
10, 11, 12, 13, 14), .Dim = c(9L, 2L))

Precisely, I want to replace the elements of 2nd column of A with those of B
provided the elements of 1st column match. Is there a single line loop or
code for that?


Thanks in advance,

Eliza Botto

    [[alternative HTML version deleted]]

______________________________________________
R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

