[Rd] Possible bug in fisher.test() (PR#14196)
(Ted Harding)
Ted.Harding at manchester.ac.uk
Wed Jan 27 19:14:59 CET 2010
On 27-Jan-10 17:30:10, nhorton at smith.edu wrote:
># is there a bug in the calculation of the odds ratio in fisher.test?
># Nicholas Horton, nhorton at smith.edu Fri Jan 22 08:29:07 EST 2010
>
> x1 = c(rep(0, 244), rep(1, 209))
> x2 = c(rep(0, 177), rep(1, 67), rep(0, 169), rep(1, 40))
>
> or1 = sum(x1==1&x2==1)*sum(x1==0&x2==0)/
> (sum(x1==1&x2==0)*sum(x1==0&x2==1))
>
> library(epitools)
> or2 = oddsratio.wald(x1, x2)$measure[2,1]
>
> or3 = fisher.test(x1, x2)$estimate
>
># or1=or2 = 0.625276, but or3=0.6259267!
>
> I'm running R 2.10.1 under Mac OS X 10.6.2.
> Nick
Not so. Look closely at ?fisher.test:
  Value:
  [...]
  estimate: an estimate of the odds ratio.  Note that the
            _conditional_ Maximum Likelihood Estimate (MLE)
            rather than the unconditional MLE (the sample
            odds ratio) is used.  Only present in the 2 by 2 case.
Your or1 (and presumably the epitools value also) is the sample OR.
The conditional MLE is the value of rho (the OR) that maximises
the probability of the table *conditional* on the margins.
In this case it differs slightly from the sample OR (by 0.1%).
For tables with smaller counts it will tend to differ even more, e.g.
  M1 <- matrix(c(4,7,17,18),nrow=2)
  M1
  #      [,1] [,2]
  # [1,]    4   17
  # [2,]    7   18
  (4*18)/(17*7)
  # [1] 0.605042
  fisher.test(M1)$estimate
  # odds ratio
  #  0.6116235   ## (1.1% larger than sample OR)
  M2 <- matrix(c(1,2,4,5),nrow=2)
  M2
  #      [,1] [,2]
  # [1,]    1    4
  # [2,]    2    5
  (1*5)/(4*2)
  # [1] 0.625
  fisher.test(M2)$estimate
  # odds ratio
  #   0.649423   ## (3.9% larger than sample OR)
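
The three comparisons above (Nick's data, M1 and M2) can be reproduced side by side with a small helper. This is only a sketch: compare_or() is a hypothetical name, not part of R or epitools, and the 2x2 table for Nick's data is built by hand from the counts in his rep() calls.

```r
# Hypothetical helper (not part of any package): compare the sample odds
# ratio of a 2x2 table with the conditional MLE reported by fisher.test().
compare_or <- function(tab) {
  stopifnot(all(dim(tab) == c(2, 2)))
  sample_or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
  cond_or   <- unname(fisher.test(tab)$estimate)
  c(sample      = sample_or,
    conditional = cond_or,
    pct_diff    = 100 * (cond_or / sample_or - 1))
}

# Nick's data as a table: rows are x1 = 0, 1 and columns are x2 = 0, 1
# (counts assumed from the rep() calls in the original post).
nick <- matrix(c(177, 169, 67, 40), nrow = 2)
M1   <- matrix(c(4, 7, 17, 18), nrow = 2)
M2   <- matrix(c(1, 2, 4, 5), nrow = 2)

round(rbind(nick = compare_or(nick),
            M1   = compare_or(M1),
            M2   = compare_or(M2)), 4)
```

The pct_diff column should grow as the cell counts shrink, in line with the 0.1%, 1.1% and 3.9% figures quoted above.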
The probability of a table matrix(c(a,b,c,d),nrow=2) given
the marginals (a+b), (a+c), (b+d) and hence also (c+d) is
a function of the odds ratio only. Again see ?fisher.test:
"given all marginal totals fixed, the first element of
the contingency table has a non-central hypergeometric
distribution with non-centrality parameter given by
the odds ratio (Fisher, 1935)."
The value of the odds ratio which maximises this probability (for the
observed 'a') is, in general, not the sample OR.
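
To make the maximisation concrete, here is a minimal sketch that computes the conditional MLE directly from the non-central hypergeometric likelihood. It is not the code fisher.test() itself uses; cond_loglik() and cond_mle() are hypothetical names, and the search interval for the log odds ratio is an arbitrary assumption. It also assumes the observed cell lies strictly inside its support, since otherwise the conditional MLE is 0 or Inf.

```r
# Conditional log-likelihood of a 2x2 table at a given log odds ratio,
# with all margins held fixed (non-central hypergeometric distribution).
cond_loglik <- function(log_psi, tab) {
  x <- tab[1, 1]
  m <- sum(tab[1, ])    # first row total
  n <- sum(tab[2, ])    # second row total
  k <- sum(tab[, 1])    # first column total
  support <- max(0, k - n):min(k, m)
  # Log of the unnormalised weights: central hypergeometric density
  # multiplied by psi^u for each possible value u of the first cell.
  logw <- dhyper(support, m, n, k, log = TRUE) + support * log_psi
  # Normalising constant, computed stably on the log scale.
  log_norm <- max(logw) + log(sum(exp(logw - max(logw))))
  logw[support == x] - log_norm
}

# Hypothetical helper: maximise the conditional likelihood over log(psi).
cond_mle <- function(tab) {
  fit <- optimize(cond_loglik, interval = c(-10, 10), tab = tab,
                  maximum = TRUE, tol = 1e-10)
  exp(fit$maximum)
}

M2 <- matrix(c(1, 2, 4, 5), nrow = 2)
cond_mle(M2)                # conditional MLE by direct maximisation
fisher.test(M2)$estimate    # should agree up to numerical tolerance
(1 * 5) / (4 * 2)           # sample odds ratio, for comparison
```

Working on the log odds ratio keeps the search interval symmetric about psi = 1 and avoids a boundary at zero.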
Hoping this helps,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 27-Jan-10 Time: 18:14:57
------------------------------ XFMail ------------------------------