[R] Bug in gmodels CrossTable()?
Jakson Alves de Aquino
jaksonaquino at gmail.com
Sun May 31 16:27:38 CEST 2009
Dear Marc Schwartz,
You are correct: there is no bug in CrossTable(). To get what I wanted, I
should have rounded the table before passing it to CrossTable():
CrossTable(round(xtabs(wgt ~ abc + def)), prop.r = F, prop.c = F,
prop.t = F, prop.chisq = F)
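
A minimal base-R sketch of the difference, using the data from my original
message. It deliberately avoids CrossTable() so that it runs without gmodels,
and the names tab, truncated and rounded are only illustrative:

```r
abc <- c("a", "a", "b", "b", "c", "c")
def <- c("d", "e", "f", "f", "d", "e")
wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)

# Weighted cross-tabulation: the cells are fractional sums of wgt
tab <- xtabs(wgt ~ abc + def)

# Dropping the fractional part, as integer coercion does to each cell
truncated <- trunc(tab)

# Rounding each cell to the nearest whole number before tabulating further
rounded <- round(tab)

print(truncated)
print(rounded)
```

With round() applied first, every cell handed to CrossTable() is already a
whole number, so the integer formatting has nothing left to truncate.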
Thank you for the explanation!
Jakson
Marc Schwartz wrote:
> On May 31, 2009, at 7:51 AM, Jakson Alves de Aquino wrote:
>
>> Is the code below showing a bug in CrossTable()? I expected the values
>> produced by xtabs() to be rounded rather than truncated:
>>
>> library(gmodels)
>> abc <- c("a", "a", "b", "b", "c", "c")
>> def <- c("d", "e", "f", "f", "d", "e")
>> wgt <- c(0.8, 0.6, 0.4, 0.5, 1.4, 1.3)
>>
>> xtabs(wgt ~ abc + def)
>>
>> CrossTable(xtabs(wgt ~ abc + def), prop.r = F, prop.c = F,
>> prop.t = F, prop.chisq = F)
>
>
> CrossTable() is designed to take one or two vectors, which are then
> [cross-]tabulated to yield integer counts, OR a matrix of integer
> counts, not fractional values. In the latter case, it is presumed that
> the matrix is the result of an 'a priori' cross-tabulation operation
> such as the use of table().
>
> The output of xtabs() above is:
>
>> xtabs(wgt ~ abc + def)
>    def
> abc   d   e   f
>   a 0.8 0.6 0.0
>   b 0.0 0.0 0.9
>   c 1.4 1.3 0.0
>
>
>
> The relevant output of CrossTable() in your example above shows:
>
>
>              | def
>          abc |         d |         e |         f | Row Total |
> -------------|-----------|-----------|-----------|-----------|
>            a |         0 |         0 |         0 |         1 |
> -------------|-----------|-----------|-----------|-----------|
>            b |         0 |         0 |         0 |         0 |
> -------------|-----------|-----------|-----------|-----------|
>            c |         1 |         1 |         0 |         2 |
> -------------|-----------|-----------|-----------|-----------|
> Column Total |         2 |         1 |         0 |         5 |
> -------------|-----------|-----------|-----------|-----------|
>
>
>
> The internal table object that would be generated here is effectively:
>
>> addmargins(xtabs(wgt ~ abc + def))
>      def
> abc     d   e   f Sum
>   a   0.8 0.6 0.0 1.4
>   b   0.0 0.0 0.9 0.9
>   c   1.4 1.3 0.0 2.7
>   Sum 2.2 1.9 0.9 5.0
>
>
>
> The textual output of CrossTable() is internally formatted using
> formatC(..., format = "d"), which is an integer-based format:
>
>> formatC(addmargins(xtabs(wgt ~ abc + def)), format = "d")
>      def
> abc   d e f Sum
>   a   0 0 0 1
>   b   0 0 0 0
>   c   1 1 0 2
>   Sum 2 1 0 5
>
>
>
> In other words, you are getting the integer-coerced values of the
> individual cells, and the column, row and table totals are coerced
> separately, so a displayed total need not equal the sum of its cells:
>
>> matrix(as.integer(addmargins(xtabs(wgt ~ abc + def))), 4, 4)
>      [,1] [,2] [,3] [,4]
> [1,]    0    0    0    1
> [2,]    0    0    0    0
> [3,]    1    1    0    2
> [4,]    2    1    0    5
>
>
>
> If you review ?as.integer, you will note the following in the 'Value'
> section:
>
> Non-integral numeric values are truncated towards zero (i.e.,
> as.integer(x) equals trunc(x) there)
>
>
>
> The output is correct, if confusing, but you are really using the
> function in a fashion that is not intended.
>
> HTH,
>
> Marc Schwartz
>
>
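
The truncation rule quoted from ?as.integer above can be checked directly in
base R. This is only a small sketch: the vector x and the other names are
made up for illustration and do not come from gmodels:

```r
x <- c(0.8, 1.4, 2.7, -0.8)

# as.integer() drops the fractional part, i.e. truncates toward zero
as_int <- as.integer(x)

# round() goes to the nearest whole number instead
rounded <- round(x)

print(as_int)
print(rounded)

# For exact halves, round() uses the "round half to even" rule,
# so it is not the same as always rounding .5 upwards
halves <- round(c(0.5, 1.5, 2.5))
print(halves)
```

The last lines matter when round() is used to prepare weighted counts: a cell
of exactly 0.5 or 2.5 is rounded to the even neighbour rather than upwards.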