[R] Selecting rows of a matrix based on some condition on the columns
David Winsemius
dwinsemius at comcast.net
Fri Mar 5 05:49:52 CET 2010
On Mar 4, 2010, at 10:59 PM, Juliet Ndukum wrote:
> The data set consists of two sets of matrices, as labelled by the
> columns, T's and C's.
>
>> xy
> x T1 T2 T3 T4 T5 C1 C2 C3 C4 C5
> [1,] 50 0.00 0.00 33.75 0.00 0.00 0.00 36.76 0.00 35.26 0.00
> [2,] 13 34.41 0.00 0.00 36.64 32.86 34.11 35.80 37.74 0.00 0.00
> [3,] 14 35.85 0.00 33.88 36.68 34.88 34.58 0.00 32.75 37.45 0.00
> [4,] 33 34.56 0.00 0.00 36.00 0.00 0.00 36.56 0.00 34.83 0.00
> [5,] 66 36.38 37.42 0.00 32.47 34.05 0.00 0.00 0.00 0.00 0.00
> [6,] 22 0.00 0.00 31.07 31.63 37.51 0.00 39.34 34.91 35.51 0.00
> [7,] 25 0.00 0.00 0.00 36.11 34.24 0.00 34.07 32.72 0.00 0.00
> [8,] 9 33.63 0.00 38.43 0.00 35.72 32.95 36.40 38.57 34.19 32.47
> [9,] 87 35.22 0.00 0.00 35.31 0.00 0.00 34.55 35.14 38.12 0.00
> [10,] 99 0.00 0.00 34.94 0.00 0.00 33.54 0.00 34.39 34.54 0.00
>
> First, I wish to select all rows that have at least one T and one C
> (that is, at least one nonzero value in each group). The code below
> gives exactly what I need.
>
>> t1all <- apply(xy,1,function(x) any((x[2]>0|x[3]>0|x[4]>0|x[5]>0|
> + x[6]>0)&(x[7]>0 |x[8]>0 |x[9]>0|x[10]>0|x[11]>0)))
>> mat.t1all <- xy[t1all,]
>> mat.t1all
> x T1 T2 T3 T4 T5 C1 C2 C3 C4 C5
> [1,] 50 0.00 0 33.75 0.00 0.00 0.00 36.76 0.00 35.26 0.00
> [2,] 13 34.41 0 0.00 36.64 32.86 34.11 35.80 37.74 0.00 0.00
> [3,] 14 35.85 0 33.88 36.68 34.88 34.58 0.00 32.75 37.45 0.00
> [4,] 33 34.56 0 0.00 36.00 0.00 0.00 36.56 0.00 34.83 0.00
> [5,] 22 0.00 0 31.07 31.63 37.51 0.00 39.34 34.91 35.51 0.00
> [6,] 25 0.00 0 0.00 36.11 34.24 0.00 34.07 32.72 0.00 0.00
> [7,] 9 33.63 0 38.43 0.00 35.72 32.95 36.40 38.57 34.19 32.47
> [8,] 87 35.22 0 0.00 35.31 0.00 0.00 34.55 35.14 38.12 0.00
> [9,] 99 0.00 0 34.94 0.00 0.00 33.54 0.00 34.39 34.54 0.00
>
> Then, I need the rows for which there are at least two T's and two
> C's. Using similar code, I get the following output:
>
>> t2all <- apply(xy,1,function(x) any(((x[2]>0&x[3]>0)|(x[2]>0&x[4]>0)|
> + (x[2]>0&x[5]>0)|(x[2]>0&x[6]>0)|(x[3]>0&x[4]>0)|(x[3]>0&x[5]>0)|
> + (x[3]>0&x[6]>0)|(x[4]>0&x[5]>0)|(x[4]>0&x[6]>0)|(x[5]>0&x[6]>0))
> +
> + &(( (x[7]>0&x[8]>0)|(x[7]>0&x[9]>0)|(x[7]>0&x[10]>0)|
> + (x[7]>0&x[11]>0)|
> + (x[8]>0&x[9]>0)|(x[8]>0&x[10]>0)|(x[8]>0&x[11]>0)|(x[9]>0&x[10]>0)|
> + (x[9]>0&x[11]>0)|(x[10]>0&x[11]>0) ))))
>>
>> mat.t2all <- xy[t2all,]
>> mat.t2all
> x T1 T2 T3 T4 T5 C1 C2 C3 C4 C5
> [1,] 13 34.41 0 0.00 36.64 32.86 34.11 35.80 37.74 0.00 0.00
> [2,] 14 35.85 0 33.88 36.68 34.88 34.58 0.00 32.75 37.45 0.00
> [3,] 33 34.56 0 0.00 36.00 0.00 0.00 36.56 0.00 34.83 0.00
> [4,] 22 0.00 0 31.07 31.63 37.51 0.00 39.34 34.91 35.51 0.00
> [5,] 25 0.00 0 0.00 36.11 34.24 0.00 34.07 32.72 0.00 0.00
> [6,] 9 33.63 0 38.43 0.00 35.72 32.95 36.40 38.57 34.19 32.47
> [7,] 87 35.22 0 0.00 35.31 0.00 0.00 34.55 35.14 38.12 0.00
>
> For three T's and three C's, I got
>
>> t3all <- apply(xy,1,function(x) any(( (x[2]>0&x[3]>0&x[4]>0)|
> + (x[2]>0&x[3]>0&x[5]>0)|(x[2]>0&x[3]>0&x[6]>0)|
> + (x[2]>0&x[4]>0&x[5]>0)|
> + (x[2]>0&x[4]>0&x[6]>0)|(x[2]>0&x[5]>0&x[6]>0)|
> + (x[3]>0&x[4]>0&x[5]>0)|(x[3]>0&x[4]>0&x[6]>0)|
> + (x[3]>0&x[5]>0&x[6]>0)|(x[4]>0&x[5]>0&x[6]>0) )
> +
> + &( (x[7]>0&x[8]>0&x[9]>0)|
> + (x[7]>0&x[8]>0&x[10]>0)|(x[7]>0&x[8]>0&x[11]>0)|
> + (x[7]>0&x[9]>0&x[10]>0)|
> + (x[7]>0&x[9]>0&x[11]>0)|(x[7]>0&x[10]>0&x[11]>0)|
> + (x[8]>0&x[9]>0&x[10]>0)|(x[8]>0&x[9]>0&x[11]>0)|
> + (x[8]>0&x[10]>0&x[11]>0)|(x[9]>0&x[10]>0&x[11]>0) ) ))
>>
>> mat.t3all <- xy[t3all,]
>> mat.t3all
> x T1 T2 T3 T4 T5 C1 C2 C3 C4 C5
> [1,] 13 34.41 0 0.00 36.64 32.86 34.11 35.80 37.74 0.00 0.00
> [2,] 14 35.85 0 33.88 36.68 34.88 34.58 0.00 32.75 37.45 0.00
> [3,] 22 0.00 0 31.07 31.63 37.51 0.00 39.34 34.91 35.51 0.00
> [4,] 9 33.63 0 38.43 0.00 35.72 32.95 36.40 38.57 34.19 32.47
>
>
> Can someone help me with better, more efficient code that will
> handle this? Thank you in advance for your help.
> JN
> xy <- data.matrix(read.table(textConnection("
+ x T1 T2 T3 T4 T5 C1 C2 C3 C4 C5
+ 50 0.00 0.00 33.75 0.00 0.00 0.00 36.76 0.00 35.26 0.00
+ 13 34.41 0.00 0.00 36.64 32.86 34.11 35.80 37.74 0.00 0.00
+ 14 35.85 0.00 33.88 36.68 34.88 34.58 0.00 32.75 37.45 0.00
+ 33 34.56 0.00 0.00 36.00 0.00 0.00 36.56 0.00 34.83 0.00
+ 66 36.38 37.42 0.00 32.47 34.05 0.00 0.00 0.00 0.00 0.00
+ 22 0.00 0.00 31.07 31.63 37.51 0.00 39.34 34.91 35.51 0.00
+ 25 0.00 0.00 0.00 36.11 34.24 0.00 34.07 32.72 0.00 0.00
+ 9 33.63 0.00 38.43 0.00 35.72 32.95 36.40 38.57 34.19 32.47
+ 87 35.22 0.00 0.00 35.31 0.00 0.00 34.55 35.14 38.12 0.00
+ 99 0.00 0.00 34.94 0.00 0.00 33.54 0.00 34.39 34.54 0.00"),
header=TRUE) )
These two vectors count the nonzero T and C entries in each row, which
should give more economical summary objects with which to work:
> rowSums(xy[, grep("T", colnames(xy))] > 0)
[1] 1 3 4 2 4 3 2 3 2 1
> rowSums(xy[, grep("C", colnames(xy))] > 0)
[1] 2 3 3 2 0 3 2 5 3 3
Or if you want to see them side by side:
> cbind(rowSums(xy[, grep("T", colnames(xy))] > 0),
+ rowSums(xy[, grep("C", colnames(xy))] > 0) )
[,1] [,2]
[1,] 1 2
[2,] 3 3
[3,] 4 3
[4,] 2 2
[5,] 4 0
[6,] 3 3
[7,] 2 2
[8,] 3 5
[9,] 2 3
[10,] 1 3
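The per-row counts can then drive the row selection directly, replacing the hand-written combinations of conditions. What follows is a minimal sketch, not a definitive implementation: select_rows() is a hypothetical helper name, the "^T" and "^C" anchors are an added assumption that the group columns are exactly those whose names start with T or C, and the matrix is rebuilt with matrix() only so that the example runs on its own.

```r
# Rebuild the example matrix from the post so the sketch is self-contained.
xy <- matrix(c(
  50,  0.00,  0.00, 33.75,  0.00,  0.00,  0.00, 36.76,  0.00, 35.26,  0.00,
  13, 34.41,  0.00,  0.00, 36.64, 32.86, 34.11, 35.80, 37.74,  0.00,  0.00,
  14, 35.85,  0.00, 33.88, 36.68, 34.88, 34.58,  0.00, 32.75, 37.45,  0.00,
  33, 34.56,  0.00,  0.00, 36.00,  0.00,  0.00, 36.56,  0.00, 34.83,  0.00,
  66, 36.38, 37.42,  0.00, 32.47, 34.05,  0.00,  0.00,  0.00,  0.00,  0.00,
  22,  0.00,  0.00, 31.07, 31.63, 37.51,  0.00, 39.34, 34.91, 35.51,  0.00,
  25,  0.00,  0.00,  0.00, 36.11, 34.24,  0.00, 34.07, 32.72,  0.00,  0.00,
   9, 33.63,  0.00, 38.43,  0.00, 35.72, 32.95, 36.40, 38.57, 34.19, 32.47,
  87, 35.22,  0.00,  0.00, 35.31,  0.00,  0.00, 34.55, 35.14, 38.12,  0.00,
  99,  0.00,  0.00, 34.94,  0.00,  0.00, 33.54,  0.00, 34.39, 34.54,  0.00),
  ncol = 11, byrow = TRUE,
  dimnames = list(NULL, c("x", paste0("T", 1:5), paste0("C", 1:5))))

# select_rows() is a hypothetical helper, not part of the original reply.
# It keeps the rows of m that have at least k nonzero T columns and at
# least k nonzero C columns.
select_rows <- function(m, k) {
  # Comparing a matrix with 0 gives a logical matrix; rowSums() then
  # counts the TRUE values in each row.
  n_t <- rowSums(m[, grep("^T", colnames(m)), drop = FALSE] > 0)
  n_c <- rowSums(m[, grep("^C", colnames(m)), drop = FALSE] > 0)
  # drop = FALSE keeps the result a matrix even if only one row matches.
  m[n_t >= k & n_c >= k, , drop = FALSE]
}

mat.t1all <- select_rows(xy, 1)
mat.t2all <- select_rows(xy, 2)
mat.t3all <- select_rows(xy, 3)
```

Because the threshold is an argument, the same call covers any "at least k T's and k C's" request without listing pairs or triples by hand, which is where the omissions tend to creep in.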
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT