[R] apply and table

peter dalgaard pdalgd at gmail.com
Sun May 19 19:27:22 CEST 2013


On May 19, 2013, at 16:22 , Jinsong Zhao wrote:

> Hi there,
> 
> I have the following code:
> 
> z <- matrix(c("A", "A", "B", "B", "C", "C", "A", "B", "C"), ncol = 3)
> apply(z, 2, table, c("A", "B", "C"))
> 
> which gives the correct results.
> 
> However, the following code:
> 
> apply(z[,1,drop=FALSE], 2, table, c("A", "B", "C"))
> 
> does not give what I expect. I had thought it should give the same result as:
> 
> apply(z, 2, table, c("A", "B", "C"))[[1]]
> 
> What's the difference? Does apply() not apply to a column vector?

To clue the casual reader in, the former gives:

> apply(z, 2, table, c("A", "B", "C"))
[[1]]
   
    A B C
  A 1 1 0
  B 0 0 1

[[2]]
   
    A B C
  B 1 0 0
  C 0 1 1

[[3]]
   
    A B C
  A 1 0 0
  B 0 1 0
  C 0 0 1

whereas the latter gives the first of the tables strung out as a 6x1 matrix.

This is a generic awkwardness of apply(). It tries to simplify the result (similar to sapply), so if the results for all columns have the same length (say, k), it converts them to a (k x C) matrix, where C is the number of columns. If the lengths differ, it gives up and returns a list. With a single column there is only one result, so the lengths trivially agree and you get a (k x 1) matrix.
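To make the two cases concrete, here is a small sketch using the z from the question. The names res_all and res_one are only illustrative, and lengths() needs a reasonably recent R (3.2.0 or later):

```r
z <- matrix(c("A", "A", "B", "B", "C", "C", "A", "B", "C"), ncol = 3)
ABC <- c("A", "B", "C")

# Three columns: the tables have 6, 6 and 9 cells, so the lengths differ
# and apply() falls back to returning a list.
res_all <- apply(z, 2, table, ABC)
is.list(res_all)
lengths(res_all)

# One column: there is a single result of length 6, so apply() simplifies
# it and strings the 2 x 3 table out as a 6 x 1 matrix.
res_one <- apply(z[, 1, drop = FALSE], 2, table, ABC)
dim(res_one)
```

So the single-column call is not treated specially; it is just the case where simplification always succeeds.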

So if we modify the code to always give a 3x3 table, every result has length 9 and apply() strings them out as the columns of a 9x3 matrix:

> ABC <- LETTERS[1:3]
> apply(z, 2, function(x) table(factor(x, levels=ABC), ABC))
      [,1] [,2] [,3]
 [1,]    1    0    1
 [2,]    0    1    0
 [3,]    0    0    0
 [4,]    1    0    0
 [5,]    0    0    1
 [6,]    0    1    0
 [7,]    0    0    0
 [8,]    1    0    0
 [9,]    0    1    1

(This, incidentally, also answers your question below.)
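Spelled out for the example in that question, the same factor() trick looks roughly like this. The names x and tab are just for illustration, and as.data.frame.matrix() is one of several ways to get a data frame in the same square layout:

```r
ABC <- LETTERS[1:3]
x <- c("C", "B", "B")

# Without fixed levels, the unused row "A" is dropped (a 2 x 3 table).
dim(table(x, ABC))

# Declaring all three levels keeps the empty row, giving a 3 x 3 table.
tab <- table(factor(x, levels = ABC), ABC)
tab

# The same counts as a data frame with rows and columns A, B, C.
df <- as.data.frame.matrix(tab)
df
```

The point is that table() only tabulates the levels it sees, so the levels have to be declared up front to get the empty row.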

You can't turn simplification off in apply(), but a passable workaround is

> tapply(z, col(z), function(x) table(factor(x, levels=ABC), ABC))
$`1`
   ABC
    A B C
  A 1 1 0
  B 0 0 1
  C 0 0 0

$`2`
   ABC
    A B C
  A 0 0 0
  B 1 0 0
  C 0 1 1

$`3`
   ABC
    A B C
  A 1 0 0
  B 0 1 0
  C 0 0 1
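Another way to sidestep the simplification, sketched here under the assumption that a plain list of tables is what is wanted, is to loop over the column indices with lapply(), which never simplifies. The helper names tab_fun and col_tables are made up for this example. As far as I know, R 4.1.0 and later also give apply() a simplify argument, so the last part is guarded by a version check:

```r
z <- matrix(c("A", "A", "B", "B", "C", "C", "A", "B", "C"), ncol = 3)
ABC <- LETTERS[1:3]

# Hypothetical helper: one 3 x 3 table per column vector.
tab_fun <- function(x) table(factor(x, levels = ABC), ABC)

# Hypothetical helper: apply tab_fun to each column, always returning a list.
col_tables <- function(m) lapply(seq_len(ncol(m)), function(j) tab_fun(m[, j]))

tabs <- col_tables(z)                       # list of three 3 x 3 tables
one  <- col_tables(z[, 1, drop = FALSE])    # list of one 3 x 3 table
one[[1]]

# Assumption: the simplify argument exists only in R >= 4.1.0.
if (getRversion() >= "4.1.0") {
  tabs_apply <- apply(z, 2, tab_fun, simplify = FALSE)
}
```

Unlike apply(), this gives the same kind of result whether the matrix has one column or many.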



> 
> Another question: how can I output the table as a square matrix (or data frame)? For example:
> 
> > table(c("C", "B", "B"), c("A", "B", "C"))
> 
>    A B C
>  B 0 1 1
>  C 1 0 0
> 
> I hope to get a result something like:
> 
>    A B C
>  A 0 0 0
>  B 0 1 1
>  C 1 0 0
> 
> Is there a way to output that?
> 
> Any suggestions will be really appreciated. Thanks in advance.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


