[Rd] Are r2dtable and C_r2dtable behaving correctly?

Peter Dalgaard pdalgd at gmail.com
Fri Aug 25 11:43:40 CEST 2017


> On 25 Aug 2017, at 10:30 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> 
[...]
> https://stackoverflow.com/questions/37309276/r-r2dtable-contingency-tables-are-too-concentrated
> 
> 
>> set.seed(1); system.time(tabs <- r2dtable(1e6, c(100, 100), c(100, 100))); A11 <- vapply(tabs, function(x) x[1, 1], numeric(1))
>   user  system elapsed 
>  0.218   0.025   0.244 
>> table(A11)
> 
>    34     35     36     37     38     39     40     41     42     43 
>     2     17     40    129    334    883   2026   4522   8766  15786 
>    44     45     46     47     48     49     50     51     52     53 
> 26850  42142  59535  78851  96217 107686 112438 108237  95761  78737 
>    54     55     56     57     58     59     60     61     62     63 
> 59732  41474  26939  16006   8827   4633   2050    865    340    116 
>    64     65     66     67 
>    38     13      7      1 
>> 
> 
> For a  2x2  table, there's really only one degree of freedom,
> hence the above characterizes the full distribution for that
> case.
> 
> I would have expected to see all possible values in  0:100
> instead of such a "normal-like" distribution with support only
> in [34, 67].

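Hmm, am I missing a point here? With all four margins fixed at 100, the
[1,1] cell of a 2x2 table is hypergeometric, so the expected counts in
1e6 draws are: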

> round(dhyper(0:100,100,100,100)*1e6)
  [1]      0      0      0      0      0      0      0      0      0      0
 [11]      0      0      0      0      0      0      0      0      0      0
 [21]      0      0      0      0      0      0      0      0      0      0
 [31]      0      0      0      1      4     13     43    129    355    897
 [41]   2087   4469   8819  16045  26927  41700  59614  78694  95943 108050
 [51] 112416 108050  95943  78694  59614  41700  26927  16045   8819   4469
 [61]   2087    897    355    129     43     13      4      1      0      0
 [71]      0      0      0      0      0      0      0      0      0      0
 [81]      0      0      0      0      0      0      0      0      0      0
 [91]      0      0      0      0      0      0      0      0      0      0
[101]      0
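
For anyone who wants to check the agreement directly, here is a minimal
sketch that puts the simulated counts next to the hypergeometric
expectations. The names n_sim, observed, expected and p_outside are mine,
and n_sim = 1e5 is an arbitrary choice to keep the run short (the thread
used 1e6); treat it as an illustration, not as a formal test of r2dtable.

```r
# Sketch: compare r2dtable() draws with the hypergeometric pmf.
# n_sim is an assumed, arbitrary simulation size (the thread used 1e6).
set.seed(1)
n_sim <- 1e5
tabs <- r2dtable(n_sim, c(100, 100), c(100, 100))
A11 <- vapply(tabs, function(x) x[1, 1], numeric(1))

# Observed counts for every possible value 0..100, zeros included.
observed <- tabulate(A11 + 1L, nbins = 101)

# Expected counts if A11 ~ Hypergeometric(m = 100, n = 100, k = 100).
expected <- dhyper(0:100, 100, 100, 100) * n_sim

# Side-by-side view, restricted to the values that actually occurred.
keep <- observed > 0
print(cbind(A11 = (0:100)[keep],
            observed = observed[keep],
            expected = round(expected[keep])))

# Probability mass outside [34, 67] under the hypergeometric model.
p_outside <- 1 - sum(dhyper(34:67, 100, 100, 100))
print(p_outside)
```

If the hypergeometric reading is right, p_outside should be very small,
which would explain why values outside [34, 67] did not show up among
the 1e6 simulated tables.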


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


