[Rd] Are r2dtable and C_r2dtable behaving correctly?
Peter Dalgaard
pdalgd at gmail.com
Fri Aug 25 11:43:40 CEST 2017
> On 25 Aug 2017, at 10:30 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>
[...]
> https://stackoverflow.com/questions/37309276/r-r2dtable-contingency-tables-are-too-concentrated
>
>
>> set.seed(1); system.time(tabs <- r2dtable(1e6, c(100, 100), c(100, 100))); A11 <- vapply(tabs, function(x) x[1, 1], numeric(1))
>    user  system elapsed
>   0.218   0.025   0.244
>> table(A11)
>
>     34     35     36     37     38     39     40     41     42     43
>      2     17     40    129    334    883   2026   4522   8766  15786
>     44     45     46     47     48     49     50     51     52     53
>  26850  42142  59535  78851  96217 107686 112438 108237  95761  78737
>     54     55     56     57     58     59     60     61     62     63
>  59732  41474  26939  16006   8827   4633   2050    865    340    116
>     64     65     66     67
>     38     13      7      1
>>
>
> For a 2x2 table, there's really only one degree of freedom,
> hence the above characterizes the full distribution for that
> case.
>
> I would have expected to see all possible values in 0:100
> instead of such a "normal like" distribution with carrier only
> in [34, 67].
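Hmm, am I missing a point here? With both row totals and both column totals fixed at 100, the [1,1] cell is hypergeometric, and the expected counts in 1e6 draws look very much like your table: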
> round(dhyper(0:100,100,100,100)*1e6)
  [1]      0      0      0      0      0      0      0      0      0      0
 [11]      0      0      0      0      0      0      0      0      0      0
 [21]      0      0      0      0      0      0      0      0      0      0
 [31]      0      0      0      1      4     13     43    129    355    897
 [41]   2087   4469   8819  16045  26927  41700  59614  78694  95943 108050
 [51] 112416 108050  95943  78694  59614  41700  26927  16045   8819   4469
 [61]   2087    897    355    129     43     13      4      1      0      0
 [71]      0      0      0      0      0      0      0      0      0      0
 [81]      0      0      0      0      0      0      0      0      0      0
 [91]      0      0      0      0      0      0      0      0      0      0
[101]      0
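
The comparison can also be done side by side. What follows is only a minimal sketch, not a formal test: it assumes a smaller run of 1e5 tables (an arbitrary choice to keep it quick), and the names n_sim, obs and expd are made up for illustration. It tabulates the simulated [1,1] cell over 0:100 and sets it against the hypergeometric expectation, then compares the sample mean and standard deviation with the theoretical values 50 and sqrt(2500/199).

```r
set.seed(1)
n_sim <- 1e5   # assumed sample size, smaller than the 1e6 used above

# Simulate 2x2 tables with all margins equal to 100 and extract cell [1,1]
tabs <- r2dtable(n_sim, c(100, 100), c(100, 100))
A11  <- vapply(tabs, function(x) x[1, 1], numeric(1))

# Observed counts for every possible value 0:100 (zeros included)
obs  <- tabulate(A11 + 1, nbins = 101)

# Expected counts under the hypergeometric distribution
expd <- dhyper(0:100, 100, 100, 100) * n_sim

# Show only the values that were actually observed
keep <- obs > 0
print(cbind(value = (0:100)[keep], observed = obs[keep],
            expected = round(expd[keep])))

# Sample moments against the theoretical ones:
# mean = 100 * 100/200 = 50
# var  = 100 * (1/2) * (1/2) * (200 - 100)/(200 - 1) = 2500/199
print(c(mean = mean(A11), sd = sd(A11), sd_theory = sqrt(2500 / 199)))
```

If the two columns agree up to sampling noise, the narrow support is just what the hypergeometric distribution gives for these margins, and not a sign of a problem in r2dtable.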
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com