[R] prop.table on three-way table?

Marc Schwartz MSchwartz at mn.rr.com
Thu Apr 20 15:10:52 CEST 2006


On Thu, 2006-04-20 at 10:28 +0200, Fredrik Karlsson wrote:
> Hi Marc,
> 
> I did not manage to get ctab to do this for me. Again, I am probably using it
> wrong, but I don't know what the problem is.
> 
> You asked for a more illustrated example, so here goes:
> 
> Take this table:
> 
> > ftable(table(sample(paste("dim1_no",1:5,sep=""),10,replace=TRUE),
> sample(paste("dim2_no",1:5,sep=""),10,replace=TRUE),
> sample(paste("dim3_no",1:5,sep=""),10,replace=TRUE)))
> 
>                    dim3_no1 dim3_no2 dim3_no3 dim3_no5
> 
> dim1_no1 dim2_no1         0        0        0        0
>          dim2_no2         0        0        2        0
>          dim2_no3         0        0        0        0
>          dim2_no4         0        0        0        0
>          dim2_no5         0        0        0        0
> dim1_no2 dim2_no1         0        0        0        0
>          dim2_no2         0        0        0        0
>          dim2_no3         0        0        0        0
>          dim2_no4         0        0        0        0
>          dim2_no5         0        1        0        1
> dim1_no3 dim2_no1         0        0        0        0
>          dim2_no2         0        0        0        0
>          dim2_no3         0        0        1        0
>          dim2_no4         0        0        0        0
>          dim2_no5         0        0        0        0
> dim1_no4 dim2_no1         1        0        0        0
>          dim2_no2         0        0        0        0
>          dim2_no3         0        0        0        0
>          dim2_no4         0        0        0        0
>          dim2_no5         1        0        0        0
> dim1_no5 dim2_no1         0        0        0        0
>          dim2_no2         0        1        0        0
>          dim2_no3         1        0        0        0
>          dim2_no4         0        1        0        0
>          dim2_no5         0        0        0        0
> >
> 
> Now, I would like to get the per cent occurrence of each level of
> dim3_noX within the cells defined by dim1 and dim2. Thus, for this section
> of the table above:
> 
>                    dim3_no1 dim3_no2 dim3_no3 dim3_no5
> dim1_no2 dim2_no1         0        0        0        0
>          dim2_no2         0        0        0        0
>          dim2_no3         0        0        0        0
>          dim2_no4         0        0        0        0
>          dim2_no5         0        1        0        1
> 
> I would like to get:
> 
>                    dim3_no1 dim3_no2 dim3_no3 dim3_no5
> dim1_no2 dim2_no1         0        0        0        0
>          dim2_no2         0        0        0        0
>          dim2_no3         0        0        0        0
>          dim2_no4         0        0        0        0
>          dim2_no5         0      0.5        0      0.5
> 
> since dim3_no2 represented 50% of the frequency within the cell
> created by dim1_no2 and dim2_no5.
> 
> 
> Hope that helped clarify my previous explanation of the problem.


Fredrik,

If I correctly understand what you are doing, which seems to be to
calculate row-based percentages (actually proportions) within each
subgroup, the following should do it.

Note that I am using set.seed() so that you can reproduce the data in
question here.


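library(catspec)  # provides ctab()

set.seed(1)       # fix the random draws so the data can be reproduced

# Three-way table of 10 random draws per dimension
x <- table(sample(paste("dim1_no", 1:5, sep = ""), 10, replace = TRUE),
           sample(paste("dim2_no", 1:5, sep = ""), 10, replace = TRUE),
           sample(paste("dim3_no", 1:5, sep = ""), 10, replace = TRUE))

# Row proportions (not percentages) within each dim1 x dim2 subgroup
ctab(x, type = "row", percentages = FALSE)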



Thus, 'x' is:

> x
, ,  = dim3_no1

          
           dim2_no1 dim2_no3 dim2_no4 dim2_no5
  dim1_no1        0        0        0        0
  dim1_no3        0        0        0        0
  dim1_no4        0        0        1        0
  dim1_no5        0        1        0        0

, ,  = dim3_no2

          
           dim2_no1 dim2_no3 dim2_no4 dim2_no5
  dim1_no1        0        1        0        0
  dim1_no3        0        0        0        0
  dim1_no4        1        0        0        0
  dim1_no5        0        0        0        0

, ,  = dim3_no3

          
           dim2_no1 dim2_no3 dim2_no4 dim2_no5
  dim1_no1        0        1        0        0
  dim1_no3        0        0        2        1
  dim1_no4        0        0        0        0
  dim1_no5        0        0        0        0

, ,  = dim3_no4

          
           dim2_no1 dim2_no3 dim2_no4 dim2_no5
  dim1_no1        0        0        0        0
  dim1_no3        0        0        0        0
  dim1_no4        0        0        1        0
  dim1_no5        0        0        0        0

, ,  = dim3_no5

          
           dim2_no1 dim2_no3 dim2_no4 dim2_no5
  dim1_no1        0        0        0        0
  dim1_no3        0        0        1        0
  dim1_no4        0        0        0        0
  dim1_no5        0        0        0        0



More concisely viewed as:

> ftable(x)
                   dim3_no1 dim3_no2 dim3_no3 dim3_no4 dim3_no5
                                                               
dim1_no1 dim2_no1         0        0        0        0        0
         dim2_no3         0        1        1        0        0
         dim2_no4         0        0        0        0        0
         dim2_no5         0        0        0        0        0
dim1_no3 dim2_no1         0        0        0        0        0
         dim2_no3         0        0        0        0        0
         dim2_no4         0        0        2        0        1
         dim2_no5         0        0        1        0        0
dim1_no4 dim2_no1         0        1        0        0        0
         dim2_no3         0        0        0        0        0
         dim2_no4         1        0        0        1        0
         dim2_no5         0        0        0        0        0
dim1_no5 dim2_no1         0        0        0        0        0
         dim2_no3         1        0        0        0        0
         dim2_no4         0        0        0        0        0
         dim2_no5         0        0        0        0        0



and...the output of using ctab() is:

> ctab(x, type = "row", percentages = FALSE)
                   dim3_no1 dim3_no2 dim3_no3 dim3_no4 dim3_no5
                                                               
dim1_no1 dim2_no1       NaN      NaN      NaN      NaN      NaN
         dim2_no3      0.00     0.50     0.50     0.00     0.00
         dim2_no4       NaN      NaN      NaN      NaN      NaN
         dim2_no5       NaN      NaN      NaN      NaN      NaN
dim1_no3 dim2_no1       NaN      NaN      NaN      NaN      NaN
         dim2_no3       NaN      NaN      NaN      NaN      NaN
         dim2_no4      0.00     0.00     0.67     0.00     0.33
         dim2_no5      0.00     0.00     1.00     0.00     0.00
dim1_no4 dim2_no1      0.00     1.00     0.00     0.00     0.00
         dim2_no3       NaN      NaN      NaN      NaN      NaN
         dim2_no4      0.50     0.00     0.00     0.50     0.00
         dim2_no5       NaN      NaN      NaN      NaN      NaN
dim1_no5 dim2_no1       NaN      NaN      NaN      NaN      NaN
         dim2_no3      1.00     0.00     0.00     0.00     0.00
         dim2_no4       NaN      NaN      NaN      NaN      NaN
         dim2_no5       NaN      NaN      NaN      NaN      NaN
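
Since the subject line asks about prop.table(), here is a minimal base-R
sketch of what should be the same calculation, without catspec. It assumes
'x' is built exactly as above; the name 'p' is only illustrative. Passing
margin = c(1, 2) divides each count by the total of its dim1 x dim2 cell,
that is, by the sum over dim3. The draws produced by set.seed(1) and
sample() depend on the R version, so the counts may not match the tables
shown in this message.

```r
# Rebuild the three-way table (same construction as in the message)
set.seed(1)
x <- table(sample(paste("dim1_no", 1:5, sep = ""), 10, replace = TRUE),
           sample(paste("dim2_no", 1:5, sep = ""), 10, replace = TRUE),
           sample(paste("dim3_no", 1:5, sep = ""), 10, replace = TRUE))

# Proportions of dim3 within each dim1 x dim2 combination:
# every count is divided by the sum over the third dimension.
p <- prop.table(x, margin = c(1, 2))

# Flatten for display, rounded to two decimals
ftable(round(p, 2))
```

Each non-empty dim1 x dim2 row of the flattened result should sum to 1.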


Note that for rows where the total is 0, you end up with NaN (Not a
Number), as opposed to 0.
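
If zeros are preferred in the empty rows, as in the desired output quoted
above, one possible approach is to overwrite the NaN entries afterwards.
The sketch below works on a base-R prop.table() result rather than on the
object returned by ctab(); the names 'p' and 'p0' are only illustrative.
Whether 0 is the right value is a judgment call, since an empty cell has
no defined proportion.

```r
# Rebuild the three-way table (same construction as in the message)
set.seed(1)
x <- table(sample(paste("dim1_no", 1:5, sep = ""), 10, replace = TRUE),
           sample(paste("dim2_no", 1:5, sep = ""), 10, replace = TRUE),
           sample(paste("dim3_no", 1:5, sep = ""), 10, replace = TRUE))

# Proportions within each dim1 x dim2 combination; empty cells give 0/0 = NaN
p <- prop.table(x, margin = c(1, 2))

# Copy the result and replace NaN with 0, leaving all other values untouched
p0 <- p
p0[is.nan(p0)] <- 0

ftable(round(p0, 2))
```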


Does that get you what you want?

HTH,

Marc Schwartz



