[Rd] fisher.test() gives wrong confidence interval (PR#4019)
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Sat Aug 30 10:29:59 MEST 2003
jerome at hivnet.ubc.ca writes:
> The problem occurs when the sample odds ratio is Inf, such as in the
> following example. Given the fact that both upper bounds of the two 95%
> confidence intervals are Inf, I would have expected that the two lower
> bounds be equal, but they aren't.
>
> x <- matrix(c(9,4,0,2),2,2)
> x
> # [,1] [,2]
> #[1,] 9 0
> #[2,] 4 2
> rbind("two.sided.95CI"=fisher.test(x)$conf.int,
> "greater.95CI"=fisher.test(x,alt="greater")$conf.int)
> # [,1] [,2]
> #two.sided.95CI 0.2985103 Inf
> #greater.95CI 0.4625314 Inf
>
> Using the noncentral hypergeometric distribution, we can calculate the
> probability mass of each possible table with same marginals as x.
> Ref.: Alan Agresti (1990). Categorical data analysis. New York: Wiley.
> Page 67.
>
> Hence, the result below suggests that the two-sided confidence interval
> has a confidence level of 97.5% as opposed to 95%.
>
> n11 <- 7:9
> theta <- 0.2985103
> choose(9,n11)*choose(15-9,13-n11)*theta^n11/
> sum(choose(9,n11)*choose(15-9,13-n11)*theta^n11)
> #[1] 0.67344877 0.30154709 0.02500414
>
> The 95% confidence interval with one-sided (greater) alternative appears
> to be correct.
>
> theta <- 0.4625314
> choose(9,n11)*choose(15-9,13-n11)*theta^n11/
> sum(choose(9,n11)*choose(15-9,13-n11)*theta^n11)
> #[1] 0.5608724 0.3891316 0.0499960
I don't think this is a bug, insofar as the problem is solvable at
all. The confidence interval consists of those x for which the test of
OR==x is not rejected at the 5% level. In the two-sided case, you
also need to count the tables that differ in the "opposite direction",
and we unceremoniously assume that their probability is the same,
so that each tail is given 2.5%.
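
A minimal sketch of what that assumption amounts to, assuming the
two-sided interval is simply built from two one-sided limits with half
of the 5% in each tail (an assumption about fisher.test(), not
something stated in the report): the two-sided lower limit should then
coincide with the lower limit of a one-sided 97.5% interval.

```r
# Table from the original report
x <- matrix(c(9, 4, 0, 2), 2, 2)

# Lower limit of the two-sided 95% interval
lo2 <- fisher.test(x)$conf.int[1]

# Lower limit of the one-sided ("greater") interval at 97.5%
lo1 <- fisher.test(x, alternative = "greater",
                   conf.level = 0.975)$conf.int[1]

# If each tail is given 2.5%, the two agree up to numerical tolerance
isTRUE(all.equal(lo2, lo1, tolerance = 1e-6))
```

If the check returns TRUE, that would account for the 97.5% coverage
computed in the report: the lower limit is a one-sided 97.5% limit.
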
For some small tables like this one the assumption is false since
there are only three tables consistent with the marginals. However,
this is not invariably the case, and trying to solve the test problem
exactly runs into the issue that the exact two-sided p-value for OR==x
is not a nice function of x (it has discontinuities and may be
non-monotone).
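
To make the discontinuity concrete, here is a rough sketch for the
table in the report, where only n11 = 7, 8, 9 are possible. The helper
names dnch() and pval2() are made up for illustration, and the rule
"sum the probabilities of all tables no more likely than the observed
one" is an assumption about how the exact two-sided p-value is defined.

```r
# Conditional (noncentral hypergeometric) probabilities of
# n11 = 7, 8, 9 for odds ratio theta, same formula as in the report
dnch <- function(theta, n11 = 7:9) {
  w <- choose(9, n11) * choose(15 - 9, 13 - n11) * theta^n11
  w / sum(w)
}

# Two-sided p-value for H0: OR == theta with n11 = 9 observed:
# add up every table whose probability does not exceed that of the
# observed table (small relative slack guards against rounding)
pval2 <- function(theta) {
  p <- dnch(theta)
  sum(p[p <= p[3] * (1 + 1e-7)])
}

# Evaluate on both sides of the point where the n11 = 7 table
# becomes no more likely than the observed one
sapply(c(1.5, 1.6), pval2)
```

Between theta = 1.5 and theta = 1.6 the n11 = 7 table enters the sum,
so the p-value jumps rather than changing smoothly.
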
Consider this case, and you'll see part of the point:
fisher.test(matrix(c(3,0,0,3),2),alt="g")
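
For reference, a small sketch of the numbers behind that call, assuming
OR = 1 so that the ordinary (central) hypergeometric distribution
applies. With all margins equal to 3, n11 can be 0, 1, 2 or 3, and in
this case the distribution is symmetric, so the "opposite direction"
table has the same probability as the observed one. The names y, p,
p_greater and p_two are only for illustration.

```r
y <- matrix(c(3, 0, 0, 3), 2)

# Null probabilities of n11 = 0, 1, 2, 3 given the margins
p <- dhyper(0:3, m = 3, n = 3, k = 3)
p

# One-sided p-value: only the observed table (n11 = 3) is as extreme
p_greater <- fisher.test(y, alternative = "greater")$p.value

# Two-sided p-value: also counts the mirror-image table (n11 = 0)
p_two <- fisher.test(y)$p.value

c(greater = p_greater, two.sided = p_two)
```

The symmetry holds only at OR = 1 with these balanced margins; for any
other odds ratio the n11 = 0 and n11 = 3 tables are no longer equally
likely.
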
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907