[R] Bug in gmodels CrossTable()?

Jakson Alves de Aquino jaksonaquino at gmail.com
Sun May 31 16:27:38 CEST 2009


Dear Marc Schwartz,

You are correct: there is no bug in CrossTable(). To get what I wanted,
I should have rounded the xtabs() result before passing it on:

CrossTable(round(xtabs(wgt ~ abc + def)), prop.r = FALSE, prop.c = FALSE,
  prop.t = FALSE, prop.chisq = FALSE)
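
For anyone reading the archive, here is a minimal sketch of why the
round() call matters. It uses only base R (no gmodels needed), and the
names tab, rounded and truncated are just illustrative. It compares
rounding the weighted table with the truncation toward zero that integer
coercion performs:

```r
# Same data as in the original question
abc <- c("a", "a", "b", "b", "c", "c")
def <- c("d", "e", "f", "f", "d", "e")
wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)

# Weighted cross-tabulation: the cells hold fractional sums of wgt
tab <- xtabs(wgt ~ abc + def)

# Rounding to the nearest integer versus truncating toward zero
rounded   <- round(tab)
truncated <- trunc(tab)

rounded["b", "f"]    # 0.9 rounds to 1
truncated["b", "f"]  # 0.9 truncates to 0

# as.integer() truncates in the same way as trunc()
all(as.integer(tab) == truncated)
```

Passing the rounded table to CrossTable() means the cells are already
whole numbers, so the integer formatting no longer discards anything.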

Thank you for the explanation!

Jakson


Marc Schwartz wrote:
> On May 31, 2009, at 7:51 AM, Jakson Alves de Aquino wrote:
> 
>> Is the code below showing a bug in CrossTable()? I expected the values
>> produced by xtabs() to be rounded rather than truncated:
>>
>> library(gmodels)
>> abc <- c("a", "a", "b", "b", "c", "c")
>> def <- c("d", "e", "f", "f", "d", "e")
>> wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)
>>
>> xtabs(wgt ~ abc + def)
>>
>> CrossTable(xtabs(wgt ~ abc + def), prop.r = FALSE, prop.c = FALSE,
>>  prop.t = FALSE, prop.chisq = FALSE)
> 
> 
> CrossTable() is designed to take one or two vectors, which are then
> [cross-]tabulated to yield integer counts, OR a matrix of integer
> counts, not fractional values. In the latter case, it is presumed that
> the matrix is the result of an 'a priori' cross-tabulation operation
> such as the use of table().
> 
> The output of xtabs() above is:
> 
> > xtabs(wgt ~ abc + def)
>    def
> abc   d   e   f
>   a 0.8 0.6 0.0
>   b 0.0 0.0 0.9
>   c 1.4 1.3 0.0
> 
> 
> 
> The relevant output of CrossTable() in your example above shows:
> 
> 
>              | def
>          abc |         d |         e |         f | Row Total |
> -------------|-----------|-----------|-----------|-----------|
>            a |         0 |         0 |         0 |         1 |
> -------------|-----------|-----------|-----------|-----------|
>            b |         0 |         0 |         0 |         0 |
> -------------|-----------|-----------|-----------|-----------|
>            c |         1 |         1 |         0 |         2 |
> -------------|-----------|-----------|-----------|-----------|
> Column Total |         2 |         1 |         0 |         5 |
> -------------|-----------|-----------|-----------|-----------|
> 
> 
> 
> The internal table object that would be generated here is effectively:
> 
> > addmargins(xtabs(wgt ~ abc + def))
>      def
> abc     d   e   f Sum
>   a   0.8 0.6 0.0 1.4
>   b   0.0 0.0 0.9 0.9
>   c   1.4 1.3 0.0 2.7
>   Sum 2.2 1.9 0.9 5.0
> 
> 
> 
> The textual output of CrossTable() is internally formatted using
> formatC(..., format = "d"), which is an integer-based format:
> 
> > formatC(addmargins(xtabs(wgt ~ abc + def)), format = "d")
>      def
> abc   d e f Sum
>   a   0 0 0 1
>   b   0 0 0 0
>   c   1 1 0 2
>   Sum 2 1 0 5
> 
> 
> 
> In other words, you are getting the integer-coerced values of the
> individual cells, and likewise of the column, row and table totals:
> 
> > matrix(as.integer(addmargins(xtabs(wgt ~ abc + def))), 4, 4)
>      [,1] [,2] [,3] [,4]
> [1,]    0    0    0    1
> [2,]    0    0    0    0
> [3,]    1    1    0    2
> [4,]    2    1    0    5
> 
> 
> 
> If you review ?as.integer, you will note the following in the 'Value'
> section:
> 
>   Non-integral numeric values are truncated towards zero (i.e.,
> as.integer(x) equals trunc(x) there)
> 
> 
> 
> The output is correct, if confusing, but you are really using the
> function in a way that was not intended.
> 
> HTH,
> 
> Marc Schwartz
> 
>



