[R] Comparing two matrices
Duncan Murdoch
murdoch at stats.uwo.ca
Thu Jul 6 14:46:51 CEST 2006
On 7/6/2006 8:18 AM, Srinivas Iyyer wrote:
> hi:
>
> I have a matrix with dimensions (200 x 20,000). I have
> another file, a tab-delimited file, where the first column's
> values are row names and the second column's values
> are column names.
>
>
> For instance:
>
>> tmat
>   Apple Orange Mango Grape Star
> A     0      0     0     0    0
> O     0      0     0     0    0
> M     0      0     0     0    0
> G     0      0     0     0    0
> S     0      0     0     0    0
>
>
>
>> tb # tab-delim file.
>       V1 V2
> 1  Apple  S
> 2  Apple  A
> 3  Apple  O
> 4 Orange  A
> 5 Orange  O
> 6 Orange  S
> 7  Mango  M
> 8  Mango  A
> 9  Mango  S
>
>
> I have to read each line of 'tb' (the tab-delim file),
> take the first variable, and check whether it matches any
> row name of the matrix. Then take the second variable in that
> row and check whether it matches any column name. If both
> match, put 1 in that cell; otherwise leave it.
>
>
> The following is a small piece of code that I felt is
> a solution. However, since my original matrix and
> tab-delim file are very large, I am not sure if it
> is really doing the correct thing. Could anyone
> please tell me whether I am doing this correctly?
>
>
>
>> for(i in 1:length(tb[,1])){
> + r = tb[i,1]
> + c = as.character(tb[i,2])
> + tmat[rownames(tmat)==c,colnames(tmat)==r] <-1
> + }

I think that works, but it's not as fast as some other ways of doing the
same thing. For example, table(tb) will give you a table of the counts
of each pair of entries in tb, and pmin(table(tb), 1) will cap each
count at 1.

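The sample tb has no repeated pairs, so the effect of pmin() is not visible there. A minimal sketch with made-up data (tb_dup is a hypothetical data frame, not from the question) shows what the capping does when a pair occurs twice:

```r
# Hypothetical data: the pair (Apple, A) appears twice.
tb_dup <- data.frame(V1 = c("Apple", "Apple", "Orange"),
                     V2 = c("A", "A", "O"))

counts <- table(tb_dup)     # cross-tabulates V1 by V2; (Apple, A) is counted twice
flags  <- pmin(counts, 1)   # caps every count at 1, leaving zeros untouched

print(counts)
print(flags)
```
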
An advantage of this approach is that it will show you if there are any
entries in tb that aren't in your tmat (typos, etc.). A disadvantage is
that if there are any missing categories (e.g. G, Grape, Star in your
sample) they won't show up at all, and you may need some manipulations
to get things to look exactly the way you asked. For example,

> pmin(table(tb), 1)
        V2
V1       A M O S
  Apple  1 0 1 1
  Mango  1 1 0 1
  Orange 1 0 1 1
> pmin(table(tb[,2:1]), 1)
   V1
V2  Apple Mango Orange
  A     1     1      1
  M     0     1      0
  O     1     0      1
  S     1     1      1

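One possible way to do the "manipulations" mentioned above is to convert both columns to factors whose levels are the dimnames of tmat, so that empty categories and the original ordering are kept. This is only a sketch on the toy data from the question; the names rows, cols and result are made up for illustration. It also shows an alternative that fills tmat directly by indexing with a two-column character matrix.

```r
# Toy data reconstructed from the question.
tmat <- matrix(0, nrow = 5, ncol = 5,
               dimnames = list(c("A", "O", "M", "G", "S"),
                               c("Apple", "Orange", "Mango", "Grape", "Star")))
tb <- data.frame(
  V1 = c("Apple", "Apple", "Apple", "Orange", "Orange", "Orange",
         "Mango", "Mango", "Mango"),
  V2 = c("S", "A", "O", "A", "O", "S", "M", "A", "S"))

# Fixing the levels keeps the empty categories (G, Grape, Star) and the
# row/column order of tmat. Values of tb that are not among the levels
# become NA and are silently dropped by table().
rows <- factor(tb$V2, levels = rownames(tmat))
cols <- factor(tb$V1, levels = colnames(tmat))
result <- pmin(table(rows, cols), 1)

# Alternative: index tmat by a two-column character matrix of
# (row name, column name) pairs. This requires every name in tb
# to be present in the dimnames of tmat.
tmat[cbind(as.character(tb$V2), as.character(tb$V1))] <- 1

print(result)
print(tmat)
```

Both results should agree with the tmat shown at the end of the question. The factor version quietly ignores names that are not in tmat, so the plain table(tb) is still the better tool for spotting typos.
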
Duncan Murdoch
>
>
>
>> tmat
>   Apple Orange Mango Grape Star
> A     1      1     1     0    0
> O     1      1     0     0    0
> M     0      0     1     0    0
> G     0      0     0     0    0
> S     1      1     1     0    0
>
>
>
> Thanks.