[R] Converting Numerical Matrix to List of Strings
Douglas Bates
bates at stat.wisc.edu
Sun Jan 11 17:30:02 CET 2009
On Sun, Jan 11, 2009 at 9:38 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Hi all,
>
> Given a matrix:
>
>> mat
>
>      [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    3    3    3
> [3,]    1    1    1
> [4,]    2    1    1
> How can I convert it to a list of strings:
>> desired_output
> [1] "aaa" "ttt" "ccc" "gcc"
Are you looking for a general solution, or do you want something
specific for these 64 potential codon-like patterns? If you just want
the patterns corresponding to all possible triplets of A, C, G, T, then

  colSums(4^(0:2) * t(mat)) + 1

gives you a set of indices between 1 and 64 (each row is read as a
base-4 number whose first column is the least significant digit). Then
you need to create the 64 possible patterns. Here is one way:
> bases <- factor(c("A","C","G","T"))
> head(patterns <- do.call(paste, expand.grid(bases, bases, bases)))
[1] "A A A" "C A A" "G A A" "T A A" "A C A" "C C A"
> (mat <- matrix(c(0,3,1,2,0,3,1,1,0,3,1,1), ncol = 3))
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    3    3    3
[3,]    1    1    1
[4,]    2    1    1
> colSums(4^(0:2) * t(mat)) + 1
[1] 1 64 22 23
> patterns[colSums(4^(0:2) * t(mat)) + 1]
[1] "A A A" "T T T" "C C C" "G C C"
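For what it is worth, the index expression above reads each row of mat as a
base-4 number with column 1 as the least significant digit, which matches
expand.grid varying its first argument fastest. A small sketch of the same
arithmetic written as a matrix product (the names weights and idx are
introduced here for illustration and are not part of the transcript above):

```r
# Assumption: every entry of mat is one of the integer codes 0..3.
mat <- matrix(c(0, 3, 1, 2,  0, 3, 1, 1,  0, 3, 1, 1), ncol = 3)

# Place values 1, 4, 16: column 1 is the least significant base-4 digit.
weights <- 4^(0:(ncol(mat) - 1))

# Row-wise base-4 value, shifted by 1 because R indexes from 1.
idx <- as.vector(mat %*% weights) + 1
idx
# [1]  1 64 22 23
```

The matrix product avoids the transpose and the reliance on recycling, which
may make the intent a little easier to see.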
We will leave the elimination of the blanks in the patterns as an
exercise for the reader.
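One possible answer to that exercise, as a sketch only: pass sep = "" to
paste when building the patterns, and use lower-case bases so the result
matches the requested output. The names idx and patterns_nb below are
introduced for illustration.

```r
mat <- matrix(c(0, 3, 1, 2,  0, 3, 1, 1,  0, 3, 1, 1), ncol = 3)
bases <- c("a", "c", "g", "t")

# All 64 triplets, first position varying fastest, pasted without blanks.
grid <- expand.grid(bases, bases, bases, stringsAsFactors = FALSE)
patterns_nb <- do.call(paste, c(grid, sep = ""))

# Same base-4 index as before.
idx <- colSums(4^(0:2) * t(mat)) + 1
patterns_nb[idx]
# [1] "aaa" "ttt" "ccc" "gcc"
```

If the patterns with blanks have already been built, gsub(" ", "", patterns,
fixed = TRUE) should strip them just as well.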
>
> In principle:
>
> 1. Number of columns in the matrix = length of each string (= 3).
> 2. Number of rows in the matrix = length of the vector (= 4).
> 3. Character "a" is encoded as "0",
> "c" -> "1",
> "g" -> "2",
> "t" -> "3"
>
>
> The strings are assumed to have a uniform length within the vector,
> and that length can be greater than 3 (up to 40 characters).
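For strings of up to 40 characters a table of all patterns is not practical,
since it would need 4^40 entries. A minimal sketch of a direct approach,
assuming every entry of the matrix is one of the codes 0 to 3; the function
name codes_to_strings and its alphabet argument are hypothetical:

```r
# Map each code to its letter, then paste the letters of each row together.
codes_to_strings <- function(mat, alphabet = c("a", "c", "g", "t")) {
  # alphabet[mat + 1] returns a plain vector in column-major order,
  # so restore the original shape before collapsing the rows.
  chars <- matrix(alphabet[mat + 1], nrow = nrow(mat))
  apply(chars, 1, paste, collapse = "")
}

mat <- matrix(c(0, 3, 1, 2,  0, 3, 1, 1,  0, 3, 1, 1), ncol = 3)
codes_to_strings(mat)
# [1] "aaa" "ttt" "ccc" "gcc"
```

This works for any number of columns, because nothing depends on enumerating
the possible strings in advance.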
>
>
> - Gundala Viswanath
> Jakarta - Indonesia