[R] extract columns of a matrix/data frame

Marc Schwartz marc_schwartz at comcast.net
Tue Jul 31 21:59:57 CEST 2007


On Tue, 2007-07-31 at 11:47 -0700, yuvika wrote:
> Hello,
>  
> Thanks for the immediate help. However, I have a question for you.
> let's say the matrix looks like this
>  
> name   a1   a2   b1   b2   c1   c2
> 0       4    2    7    8    1    2
> 0       3    6    9    2    2    9
> 1       2    7    9    2    4    2
> 1       3    2    2    6    7    8
> 2       2    7    8    3    4    2
> 3       4    6    8    9    0    2
> 3       6    8    9    3    6    7
>  
> Now, what I want to do is still make submatrices, but now make 3
> matrices (based on a, b, c just like before) for name=0, 3 matrices for
> name=1, and so on.
> How can I do this?
>  
> Looking forward to your help.
> thanks
> yuvika

Yuvika,

Please be sure to 'reply to all' so that the list thread stays intact
and can be of benefit to others in the archive.  Otherwise knowledge
transfer is lost...

In this case, we can split() the initial matrix based upon the 'name'
column and then still use the initial solution, with modifications. In
effect, we end up with 'nested' loops:


> MAT
     name a1 a2 b1 b2 c1 c2
[1,]    0  4  2  7  8  1  2
[2,]    0  3  6  9  2  2  9
[3,]    1  2  7  9  2  4  2
[4,]    1  3  2  2  6  7  8
[5,]    2  2  7  8  3  4  2
[6,]    3  4  6  8  9  0  2
[7,]    3  6  8  9  3  6  7
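
To reproduce the example, MAT can be rebuilt along these lines. This is only a sketch: the values are copied from the printout above, but the matrix() call itself is an assumption, since the original construction is not shown in this thread.

```r
# Rebuild the example matrix row by row.
# Assumption: the original MAT was a plain numeric matrix with these
# column names; only its printed form appears in the thread.
MAT <- matrix(c(0, 4, 2, 7, 8, 1, 2,
                0, 3, 6, 9, 2, 2, 9,
                1, 2, 7, 9, 2, 4, 2,
                1, 3, 2, 2, 6, 7, 8,
                2, 2, 7, 8, 3, 4, 2,
                3, 4, 6, 8, 9, 0, 2,
                3, 6, 8, 9, 3, 6, 7),
              ncol = 7, byrow = TRUE,
              dimnames = list(NULL,
                              c("name", "a1", "a2", "b1", "b2", "c1", "c2")))
```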


We first need to coerce the matrix to a data frame to use this approach:

DF <- as.data.frame(MAT)

> DF
  name a1 a2 b1 b2 c1 c2
1    0  4  2  7  8  1  2
2    0  3  6  9  2  2  9
3    1  2  7  9  2  4  2
4    1  3  2  2  6  7  8
5    2  2  7  8  3  4  2
6    3  4  6  8  9  0  2
7    3  6  8  9  3  6  7


# split() DF by the 'name' column
# strip the 'name' column while we are at it
DF.split <- split(DF[, -1], DF$name)


> DF.split
$`0`
  a1 a2 b1 b2 c1 c2
1  4  2  7  8  1  2
2  3  6  9  2  2  9

$`1`
  a1 a2 b1 b2 c1 c2
3  2  7  9  2  4  2
4  3  2  2  6  7  8

$`2`
  a1 a2 b1 b2 c1 c2
5  2  7  8  3  4  2

$`3`
  a1 a2 b1 b2 c1 c2
6  4  6  8  9  0  2
7  6  8  9  3  6  7
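
If coercing to a data frame is not wanted, one possible alternative is to split the row indices instead of the matrix itself, because split() treats a matrix as a plain vector and would not return row blocks. The following is a sketch under that assumption; the names idx and MAT.split are made up for illustration and the matrix is rebuilt from the printed values.

```r
# Example matrix, rebuilt from the values printed earlier in the thread
MAT <- matrix(c(0, 4, 2, 7, 8, 1, 2,
                0, 3, 6, 9, 2, 2, 9,
                1, 2, 7, 9, 2, 4, 2,
                1, 3, 2, 2, 6, 7, 8,
                2, 2, 7, 8, 3, 4, 2,
                3, 4, 6, 8, 9, 0, 2,
                3, 6, 8, 9, 3, 6, 7),
              ncol = 7, byrow = TRUE,
              dimnames = list(NULL,
                              c("name", "a1", "a2", "b1", "b2", "c1", "c2")))

# Group the row numbers by the 'name' column ...
idx <- split(seq_len(nrow(MAT)), MAT[, "name"])

# ... then pull the matching rows out of the matrix, dropping 'name'.
# drop = FALSE keeps a single-row group (name == 2) as a matrix.
MAT.split <- lapply(idx, function(i) MAT[i, -1, drop = FALSE])
```

Each element of MAT.split is then a matrix rather than a data frame, so the column selection by letter can be applied to it in the same way.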


Now use lapply() to loop over the list above and, on each data frame
within it, apply the initial solution, with an inner lapply() instead of
sapply() so that each result stays a list of data frames:

RES <- lapply(DF.split,
              function(x) lapply(letters[1:3],
                                 function(i) x[, grep(i, colnames(x))]))


> RES
$`0`
$`0`[[1]]
  a1 a2
1  4  2
2  3  6

$`0`[[2]]
  b1 b2
1  7  8
2  9  2

$`0`[[3]]
  c1 c2
1  1  2
2  2  9


$`1`
$`1`[[1]]
  a1 a2
3  2  7
4  3  2

$`1`[[2]]
  b1 b2
3  9  2
4  2  6

$`1`[[3]]
  c1 c2
3  4  2
4  7  8


$`2`
$`2`[[1]]
  a1 a2
5  2  7

$`2`[[2]]
  b1 b2
5  8  3

$`2`[[3]]
  c1 c2
5  4  2


$`3`
$`3`[[1]]
  a1 a2
6  4  6
7  6  8

$`3`[[2]]
  b1 b2
6  8  9
7  9  3

$`3`[[3]]
  c1 c2
6  0  2
7  6  7
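
The inner list elements above are unnamed, so they can only be reached by position. A possible variation, sketched here, names them by letter so that a single block can be pulled out by name. The names prefixes and RES.named are hypothetical, the "^" anchor in the pattern and drop = FALSE are additions that are not part of the solution above, and DF is rebuilt from the printed values.

```r
# Example data, rebuilt column by column from the printed DF
DF <- data.frame(name = c(0, 0, 1, 1, 2, 3, 3),
                 a1 = c(4, 3, 2, 3, 2, 4, 6),
                 a2 = c(2, 6, 7, 2, 7, 6, 8),
                 b1 = c(7, 9, 9, 2, 8, 8, 9),
                 b2 = c(8, 2, 2, 6, 3, 9, 3),
                 c1 = c(1, 2, 4, 7, 4, 0, 6),
                 c2 = c(2, 9, 2, 8, 2, 2, 7))

DF.split <- split(DF[, -1], DF$name)
prefixes <- letters[1:3]

RES.named <- lapply(DF.split, function(x) {
  # "^" restricts the match to column names that start with the letter
  out <- lapply(prefixes, function(p) {
    x[, grep(paste0("^", p), colnames(x)), drop = FALSE]
  })
  # Label the three sub-data-frames "a", "b" and "c"
  setNames(out, prefixes)
})

# The 'b' columns for name == 1
RES.named[["1"]][["b"]]
```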



HTH,

Marc


