[R] Mid-P value for a chi-squared test

Sun Jun 13 20:19:49 CEST 2010

On 13-Jun-10 17:12:45, David Winsemius wrote:
> On Jun 1, 2010, at 4:17 AM, Wilson, Andrew wrote:
> 
>> Can anyone tell me how to calculate a mid-p value for a chi-squared  
>> test in R?
> 
> I cannot see that this has been answered. It has a date from 12 days  
> ago but I cannot see a reply in the archives.
> 
> So, what is a "mid-p value" and which "chi-square test" are you asking 
> about? A simple data setup in R code with expected output  would speed 
> this discussion along.
> 
> David Winsemius, MD

The "mid-p value" is a device for improving the accuracy of a continuous
approximation to a distribution which in reality is discrete.

Intuitively, the idea is to treat the probabilities of the discrete
distribution as if they were the bars of a histogram. Then imagine
fitting a continuous curve (e.g. a chi-squared density) to the
histogram. The fit (agreement between the proportion in one histogram
bar and the area under the portion of the curve lying over the same
range) will be better if the curve goes through the midpoint of the
top of the bar.

This leads to the formal definition:

  "mid-P" = Prob(X > Xobs) + 0.5*Prob(X = Xobs)

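As a minimal sketch of that definition, here is the upper-tail mid-P
for a binomial count. The binomial model, the function name
mid_p_upper() and the numbers are illustrative assumptions of mine,
not taken from the original question. The normal tail is included
only to show the continuous approximation the mid-P is meant to match.

```r
# Mid-p for an upper-tail test on a binomial count.
# The binomial model and the numbers below are illustrative assumptions.
mid_p_upper <- function(x_obs, size, prob) {
  p_greater <- pbinom(x_obs, size, prob, lower.tail = FALSE)  # Prob(X > Xobs)
  p_equal   <- dbinom(x_obs, size, prob)                      # Prob(X = Xobs)
  p_greater + 0.5 * p_equal
}

x_obs <- 8; n <- 10; p0 <- 0.5

p_exact <- pbinom(x_obs - 1, n, p0, lower.tail = FALSE)  # Prob(X >= Xobs)
p_mid   <- mid_p_upper(x_obs, n, p0)

# Normal approximation to the upper tail, evaluated at Xobs itself
# (no continuity correction).
p_norm <- pnorm(x_obs, mean = n * p0, sd = sqrt(n * p0 * (1 - p0)),
                lower.tail = FALSE)

print(c(exact = p_exact, mid_p = p_mid, normal = p_norm))
```

In this particular example the uncorrected normal tail lies closer to
the mid-P than to the ordinary exact tail probability, which is the
histogram-midpoint intuition described above.
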
A number of R functions use this idea. Check out what you get by
going to http://finzi.psych.upenn.edu/nmz.html and entering "mid-p"
into the search box, and see whether any of them match (or come
close to) your particular case.

In the case of the chi-squared test, the idea is related to (but
not the same as) the "Yates correction for continuity". chisq.test()
has an argument "correct" (TRUE by default) which applies this, but
only for 2x2 tables.
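
For a 2x2 table, one possible reading of "mid-P for a chi-squared
test" is to take the exact conditional distribution of the Pearson
statistic given the margins (hypergeometric under independence) and
apply the mid-P definition to it. The sketch below does that and also
shows the Yates-corrected and uncorrected chisq.test() results for
comparison. The table counts and the helper x2_stat() are hypothetical,
and this is only one interpretation, not necessarily what the original
poster had in mind.

```r
# Hypothetical 2x2 table of counts, for illustration only.
tab <- matrix(c(12, 8,
                 5, 15), nrow = 2, byrow = TRUE)

p_yates <- chisq.test(tab, correct = TRUE)$p.value   # Yates correction (default)
p_plain <- chisq.test(tab, correct = FALSE)$p.value  # uncorrected Pearson test

# Pearson X^2 for a table given as a matrix, no continuity correction.
x2_stat <- function(m) {
  e <- outer(rowSums(m), colSums(m)) / sum(m)  # expected counts
  sum((m - e)^2 / e)
}

# Enumerate every table with the observed margins. Given the margins,
# the top-left cell is hypergeometric under independence.
r <- rowSums(tab)
k <- colSums(tab)
a <- max(0, k[1] - r[2]):min(r[1], k[1])
prob <- dhyper(a, r[1], r[2], k[1])
x2 <- vapply(a, function(ai) {
  x2_stat(matrix(c(ai,        r[1] - ai,
                   k[1] - ai, r[2] - k[1] + ai), nrow = 2, byrow = TRUE))
}, numeric(1))

x2_obs <- x2_stat(tab)
tol <- 1e-8                       # guard against floating-point ties
eq  <- abs(x2 - x2_obs) < tol     # tables with X^2 equal to the observed value
gt  <- x2 > x2_obs + tol          # tables with X^2 strictly larger

p_exact <- sum(prob[gt | eq])                   # Prob(X2 >= X2obs)
p_mid   <- sum(prob[gt]) + 0.5 * sum(prob[eq])  # mid-P

print(c(yates = p_yates, uncorrected = p_plain,
        exact = p_exact, mid_p = p_mid))
```

The tolerance is there because tables that are mirror images of each
other give the same statistic in exact arithmetic but may differ in
the last few bits in floating point.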

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 13-Jun-10                                       Time: 19:19:44
------------------------------ XFMail ------------------------------