[R] Off topic --- underdispersed (pseudo) binomial data.

Thu Mar 25 02:32:48 CET 2021

I would like a real-life example of a data set which one might think to
model by a binomial distribution, but which is substantially
underdispersed. I.e. a sample X = {X_1, X_2, ..., X_N} where each X_i
is an integer between 0 and n (n known a priori) such that var(X) <<
mean(X)*(1 - mean(X)/n).
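
For concreteness, the criterion above can be checked numerically. The following is only a minimal sketch, in Python purely for illustration; the function name dispersion_ratio and the toy sample are hypothetical and are not real data. A ratio well below 1 would indicate underdispersion relative to the binomial.

```python
import statistics

def dispersion_ratio(xs, n):
    """Ratio of the sample variance to the binomial variance implied by
    the sample mean, i.e. var(X) / (mean(X) * (1 - mean(X)/n)).
    Values well below 1 suggest underdispersion relative to Binomial(n, p)."""
    m = statistics.fmean(xs)
    binom_var = m * (1 - m / n)
    if binom_var <= 0:
        # mean is 0 or n, so the binomial variance is degenerate
        raise ValueError("sample mean must lie strictly between 0 and n")
    return statistics.variance(xs) / binom_var

# Hypothetical toy sample (NOT real data), with n = 10 assumed known.
toy = [5, 5, 5, 4, 6, 5, 5, 5, 4, 6]
ratio = dispersion_ratio(toy, n=10)
print(f"dispersion ratio: {ratio:.3f}")
```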

Does anyone know of any such examples?  Do any exist?  I've done
a perfunctory web search, and had a look at "A Handbook of Small
Data Sets" by Hand, Daly, Lunn, et al., and drawn a blank.

I've seen on the web some references to underdispersed "pseudo-Poisson"
data, but not to underdispersed "pseudo-binomial" data.  And of course
there's lots of *over* dispersed stuff.  But that's not what I want.

I can *simulate* data sets of the sort that I am looking for (so far the
only ideas I've had for doing this are pretty simplistic and
artificial) but I'd like to get my hands on a *real* example, if
possible.
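
As an illustration of one simplistic, artificial construction (an assumption for the sake of example, not necessarily the approach alluded to above): let each X_i be a sum of n independent Bernoulli trials whose success probabilities differ from trial to trial. The variance of such a sum is never larger than that of a binomial with the same mean, and it is much smaller when the probabilities sit near 0 and 1. The sketch below is in Python purely for illustration; the names simulate_underdispersed and probs, and all parameter values, are hypothetical.

```python
import random
import statistics

def simulate_underdispersed(N, probs, seed=0):
    """Draw N counts, each the sum of len(probs) independent Bernoulli
    trials with heterogeneous success probabilities probs[j]."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for p in probs) for _ in range(N)]

# Hypothetical choice: n = 10 trials, half nearly sure to fail,
# half nearly sure to succeed.
probs = [0.05] * 5 + [0.95] * 5
n = len(probs)
xs = simulate_underdispersed(2000, probs)

m = statistics.fmean(xs)
# Sample variance relative to the binomial variance implied by the mean.
ratio = statistics.variance(xs) / (m * (1 - m / n))
print(f"mean = {m:.3f}, dispersion ratio = {ratio:.3f}")
```

With these made-up probabilities the theoretical ratio is 0.475/2.5 = 0.19, so the simulated counts should be strongly underdispersed.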

Grateful for any pointers/suggestions.

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276