[R] proportions confidence intervals

Rolf Turner rolf at math.unb.ca
Mon Jul 12 21:29:57 CEST 2004


There has been a plethora of responses over the past hour or so to a
question posed by Darren Shaw about how to estimate (get a confidence
interval for) a proportion based on a data set consisting of a number
of proportions.  These responses have all been off the point.  I
would suggest to the responders:

			RTFQ

The question was not about how to calculate a confidence interval for
a proportion.  Responders have gone on and on with academic wanking
about alternatives to the ``standard'' procedure, some of which give
better coverage properties (and some of which don't; so-called
``exact'' methods are notoriously bad).

The point of the question was how to combine the information from a
number of (sample) proportions.  If the structure and context are as
I conjectured in my posting then

	(a) this is simple, and

	(b) the combined sample size is almost surely large enough so
	that the simple and easy standard procedure will produce an
	eminently adequate result.  (Thus making the alternative
	approaches even more of an academic wank than they usually are.)

	I think at this point it is worthwhile repeating the
	quote posted a while back by Doug Bates.  (He attributed
	the quote to George Box, but was unable to supply a 
	citation; I wrote to Box asking him about the quote, and
	he said ``Nope.  'Twarn't me.'')  But irrespective of the
	source of the quote, the point it makes is valid:

	``You have a big approximation and a small approximation.  The
	big approximation is your approximation to the problem you
	want to solve.  The small approximation is involved in
	getting the solution to the approximate problem.''

That is to say, there are ***many*** effects which will have an impact
on the proportion estimate required.  (Were the samples really random?
Were they really independent?  Were they really all taken from the
same population, or from populations with the same true proportion?)
Next to such considerations, the roughness of the usual/standard
approximate CI for a proportion pales by comparison.
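
For concreteness, the pooling described in (a) and (b) above might be
sketched as follows.  This is only a sketch under the conjectured
structure: each sample is assumed to report a success count and a
sample size, and all samples are assumed to be independent draws from
one population.  The counts are made up for illustration, and
pooled_wald_ci is a hypothetical helper name, not a function from any
package.

```python
import math

def pooled_wald_ci(successes, sizes, z=1.96):
    """Pool counts across samples; return (p_hat, lower, upper).

    Assumes independent samples from a single population, so the
    counts can simply be added.  z=1.96 gives an approximate 95%
    interval by the standard (Wald) procedure.
    """
    x = sum(successes)          # total number of successes
    n = sum(sizes)              # combined sample size
    p_hat = x / n               # pooled proportion estimate
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - z * se, p_hat + z * se

# Illustrative (invented) data: four samples.
successes = [12, 30, 7, 21]
sizes = [40, 100, 25, 60]

p_hat, lo, hi = pooled_wald_ci(successes, sizes)
print(f"p_hat = {p_hat:.3f}, approx. 95% CI = ({lo:.3f}, {hi:.3f})")
```

Note that the pooled estimate is the total successes over the total
sample size, not the unweighted mean of the individual proportions;
the two agree only when all the sample sizes are equal.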

				cheers,

					Rolf Turner
					rolf at math.unb.ca



