[R] Encoding Vector of Strings into Numerical Matrix

jim holtman jholtman at gmail.com
Tue Jan 6 02:33:00 CET 2009


Try this. It splits each string into single characters and looks each one up in a named vector:

> tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
> key <- c(a=0, c=1, g=2, t=3, n=0)
> x <- t(sapply(strsplit(tags, ''), function(z) key[z]))
> x
     a a a
[1,] 0 0 0
[2,] 3 3 3
[3,] 1 1 1
[4,] 2 1 1
[5,] 0 3 0
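
The "a a a" column labels above are carried over from the names of `key`; the values themselves match the requested matrix. As a rough sketch only, the same lookup can be wrapped in a function that drops those names and stops on characters missing from the key. The name `encode_tags` and the error checks are my own assumptions, not part of the original answer, and the function relies on the uniform string length stated in the question.

```r
# Hypothetical helper (not from the original post): encode equal-length
# strings into a numeric matrix, one row per string.
encode_tags <- function(tags, key = c(a = 0, c = 1, g = 2, t = 3, n = 0)) {
  chars <- strsplit(tags, "", fixed = TRUE)
  # The question assumes uniform string length; fail early otherwise.
  stopifnot(length(unique(lengths(chars))) == 1L)
  flat <- unlist(chars)
  unknown <- setdiff(flat, names(key))
  if (length(unknown) > 0L) {
    stop("characters missing from key: ", paste(unknown, collapse = ", "))
  }
  # Look up all characters at once, then fill the matrix row by row.
  matrix(unname(key[flat]), nrow = length(tags), byrow = TRUE)
}

tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
x <- encode_tags(tags)
x
```

Filling with `byrow = TRUE` avoids the `t(sapply(...))` step, and `unname()` leaves the default `[,1] [,2] [,3]` column labels shown in the question.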


On Mon, Jan 5, 2009 at 8:26 PM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Dear all,
>
> Given a character vector such as:
>
> tags <- c("aaa", "ttt", "ccc", "gcc", "atn")
>
> how can I obtain the corresponding matrix?
>
>     [,1] [,2] [,3]
> [1,]    0    0    0
> [2,]    3    3    3
> [3,]    1    1    1
> [4,]    2    1    1
> [5,]    0    3    0
>
>
> In principle:
>
> 1. Number of columns in the matrix = length of each string (= 3).
> 2. Number of rows in the matrix = length of the vector (= 5).
> 3. Characters are encoded as follows:
>   "a" -> "0",
>   "c" -> "1",
>   "g" -> "2",
>   "t" -> "3",
>   "n" -> "0"
>
> The strings are assumed to have uniform length within the vector,
> and that length can be greater than 3 (up to 40 characters).
>
> - Gundala Viswanath
> Jakarta - Indonesia
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



