fisher.test {stats}  R Documentation 
Performs Fisher's exact test for testing the null of independence of rows and columns in a contingency table with fixed marginals.
fisher.test(x, y = NULL, workspace = 200000, hybrid = FALSE, hybridPars = c(expect = 5, percent = 80, Emin = 1), control = list(), or = 1, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95, simulate.p.value = FALSE, B = 2000)
x 
either a twodimensional contingency table in matrix form, or a factor object. 
y 
a factor object; ignored if 
workspace 
an integer specifying the size of the workspace
used in the network algorithm. In units of 4 bytes. Only used for
nonsimulated pvalues larger than 2 by 2 tables.
Since R version 3.5.0, this also increases the internal stack size
which allows larger problems to be solved, however sometimes needing
hours. In such cases, 
hybrid 
a logical. Only used for larger than 2 by 2 tables, in which cases it indicates whether the exact probabilities (default) or a hybrid approximation thereof should be computed. 
hybridPars 
a numeric vector of length 3, by default describing “Cochran's conditions” for the validity of the chisquare approximation, see ‘Details’. 
control 
a list with named components for low level algorithm
control. At present the only one used is 
or 
the hypothesized odds ratio. Only used in the 2 by 2 case. 
alternative 
indicates the alternative hypothesis and must be
one of 
conf.int 
logical indicating if a confidence interval for the odds ratio in a 2 by 2 table should be computed (and returned). 
conf.level 
confidence level for the returned confidence
interval. Only used in the 2 by 2 case and if

simulate.p.value 
a logical indicating whether to compute pvalues by Monte Carlo simulation, in larger than 2 by 2 tables. 
B 
an integer specifying the number of replicates used in the Monte Carlo test. 
If x
is a matrix, it is taken as a twodimensional contingency
table, and hence its entries should be nonnegative integers.
Otherwise, both x
and y
must be vectors of the same
length. Incomplete cases are removed, the vectors are coerced into
factor objects, and the contingency table is computed from these.
For 2 by 2 cases, pvalues are obtained directly using the (central or noncentral) hypergeometric distribution. Otherwise, computations are based on a C version of the FORTRAN subroutine FEXACT which implements the network developed by Mehta and Patel (1983, 1986) and improved by Clarkson, Fan and Joe (1993). The FORTRAN code can be obtained from https://www.netlib.org/toms/643. Note this fails (with an error message) when the entries of the table are too large. (It transposes the table if necessary so it has no more rows than columns. One constraint is that the product of the row marginals be less than 2^31  1.)
For 2 by 2 tables, the null of conditional
independence is equivalent to the hypothesis that the odds ratio
equals one. ‘Exact’ inference can be based on observing that in
general, given all marginal totals fixed, the first element of the
contingency table has a noncentral hypergeometric distribution with
noncentrality parameter given by the odds ratio (Fisher, 1935). The
alternative for a onesided test is based on the odds ratio, so
alternative = "greater"
is a test of the odds ratio being bigger
than or
.
Twosided tests are based on the probabilities of the tables, and take as ‘more extreme’ all tables with probabilities less than or equal to that of the observed table, the pvalue being the sum of such probabilities.
For larger than 2 by 2 tables and hybrid = TRUE
,
asymptotic chisquared probabilities are only used if the
‘Cochran conditions’ (or modified version thereof) specified by
hybridPars = c(expect = 5, percent = 80, Emin = 1)
are
satisfied, that is if no cell has expected counts less than
1
(= Emin
) and more than 80% (= percent
) of the
cells have expected counts at least 5 (= expect
), otherwise
the exact calculation is used. A corresponding if()
decision
is made for all subtables considered.
Accidentally, R has used 180
instead of 80
as
percent
, i.e., hybridPars[2]
in R versions between
3.0.0 and 3.4.1 (inclusive), i.e., the 2nd of the hybridPars
(all of which used to be hardcoded previous to R 3.5.0).
Consequently, in these versions of R, hybrid=TRUE
never made a
difference.
In the r x c case with r > 2 or c > 2,
internal tables can get too large for the exact test in which case an
error is signalled. Apart from increasing workspace
sufficiently, which then may lead to very long running times, using
simulate.p.value = TRUE
may then often be sufficient and hence
advisable.
Simulation is done conditional on the row and column marginals, and works only if the marginals are strictly positive. (A C translation of the algorithm of Patefield (1981) is used.)
A list with class "htest"
containing the following components:
p.value 
the pvalue of the test. 
conf.int 
a confidence interval for the odds ratio.
Only present in the 2 by 2 case and if argument

estimate 
an estimate of the odds ratio. Note that the conditional Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used. Only present in the 2 by 2 case. 
null.value 
the odds ratio under the null, 
alternative 
a character string describing the alternative hypothesis. 
method 
the character string

data.name 
a character string giving the names of the data. 
Agresti, A. (1990). Categorical data analysis. New York: Wiley. Pages 59–66.
Agresti, A. (2002). Categorical data analysis. Second edition. New York: Wiley. Pages 91–101.
Fisher, R. A. (1935). The logic of inductive inference. Journal of the Royal Statistical Society Series A, 98, 39–54. doi: 10.2307/2342435.
Fisher, R. A. (1962). Confidence limits for a crossproduct ratio. Australian Journal of Statistics, 4, 41. doi: 10.1111/j.1467842X.1962.tb00285.x.
Fisher, R. A. (1970). Statistical Methods for Research Workers. Oliver & Boyd.
Mehta, Cyrus R. and Patel, Nitin R. (1983). A network algorithm for performing Fisher's exact test in r x c contingency tables. Journal of the American Statistical Association, 78, 427–434. doi: 10.1080/01621459.1983.10477989.
Mehta, C. R. and Patel, N. R. (1986). Algorithm 643: FEXACT, a FORTRAN subroutine for Fisher's exact test on unordered r x c contingency tables. ACM Transactions on Mathematical Software, 12, 154–161. doi: 10.1145/6497.214326.
Clarkson, D. B., Fan, Y. and Joe, H. (1993) A Remark on Algorithm 643: FEXACT: An Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables. ACM Transactions on Mathematical Software, 19, 484–488. doi: 10.1145/168173.168412.
Patefield, W. M. (1981). Algorithm AS 159: An efficient method of generating r x c tables with given row and column totals. Applied Statistics, 30, 91–97. doi: 10.2307/2346669.
fisher.exact
in package exact2x2 for alternative
interpretations of twosided tests and confidence intervals for
2 by 2 tables.
## Agresti (1990, p. 61f; 2002, p. 91) Fisher's Tea Drinker ## A British woman claimed to be able to distinguish whether milk or ## tea was added to the cup first. To test, she was given 8 cups of ## tea, in four of which milk was added first. The null hypothesis ## is that there is no association between the true order of pouring ## and the woman's guess, the alternative that there is a positive ## association (that the odds ratio is greater than 1). TeaTasting < matrix(c(3, 1, 1, 3), nrow = 2, dimnames = list(Guess = c("Milk", "Tea"), Truth = c("Milk", "Tea"))) fisher.test(TeaTasting, alternative = "greater") ## => p = 0.2429, association could not be established ## Fisher (1962, 1970), Criminal convictions of likesex twins Convictions < matrix(c(2, 10, 15, 3), nrow = 2, dimnames = list(c("Dizygotic", "Monozygotic"), c("Convicted", "Not convicted"))) Convictions fisher.test(Convictions, alternative = "less") fisher.test(Convictions, conf.int = FALSE) fisher.test(Convictions, conf.level = 0.95)$conf.int fisher.test(Convictions, conf.level = 0.99)$conf.int ## A r x c table Agresti (2002, p. 57) Job Satisfaction Job < matrix(c(1,2,1,0, 3,3,6,1, 10,10,14,9, 6,7,12,11), 4, 4, dimnames = list(income = c("< 15k", "1525k", "2540k", "> 40k"), satisfaction = c("VeryD", "LittleD", "ModerateS", "VeryS"))) fisher.test(Job) # 0.7827 fisher.test(Job, simulate.p.value = TRUE, B = 1e5) # also close to 0.78 ## 6th example in Mehta & Patel's JASA paper MP6 < rbind( c(1,2,2,1,1,0,1), c(2,0,0,2,3,0,0), c(0,1,1,1,2,7,3), c(1,1,2,0,0,0,1), c(0,1,1,1,1,0,0)) fisher.test(MP6) # Exactly the same pvalue, as Cochran's conditions are never met: fisher.test(MP6, hybrid=TRUE)