[R] estimating quantiles from binned data

Russell Senior seniorr at aracnet.com
Sun Sep 14 22:13:42 CEST 2003


>>>>> "Spencer" == Spencer Graves <spencer.graves at pdf.com> writes:

Russell> Suppose I have a set of binned data, counts exceeding a
Russell> series of arbitrary thresholds, a total N, a minimum and
Russell> maximum, those sorts of things.  Is there a "standard" method
Russell> for estimating arbitrary quantiles from this?  My initial
Russell> thought is that the counts and min/max give me solutions at
Russell> various points along the empirical cdf.  As the data are
Russell> roughly log-normal, I thought maybe I could use piece-wise
Russell> log-normal distributions between these points to estimate the
Russell> arbitrary quantiles I am interested in.  Are there "better
Russell> thought out" methods than this?  Thanks!

Spencer> Have you considered making a normal probability plot?  

This is probably not practical, given that I have on the order of 7000
sets of binned data to evaluate.  I have prior knowledge of the data
involved (i.e. from cases where we have an actual sample rather than
just bin counts), and though it isn't perfect, a log-normal fit is
usually not too bad, particularly in comparison to a normal
distribution.  I also want the estimate to match at the points where
we "know" the quantiles precisely (i.e. at the bin boundaries).
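
To make the piecewise idea concrete, here is a minimal sketch in
Python (standard library only), not a vetted method.  It relies on the
fact that a log-normal cdf is a straight line when the probit of the
cdf is plotted against log(x), so interpolating linearly between the
known points on that scale gives a piecewise log-normal curve that
passes exactly through the bin boundaries.  The function name and
arguments are hypothetical, and placing the minimum and maximum at
probabilities 0.5/N and 1 - 0.5/N is purely an assumption, made
because the probit of exactly 0 or 1 is infinite.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for the probit transform


def quantile_from_bins(thresholds, counts_above, n, x_min, x_max, p):
    """Estimate the p-th quantile from exceedance counts.

    Interpolates linearly in (log x, probit of cdf) between the known
    points, i.e. piecewise log-normal.  Requires x_min > 0 and n > 1.
    """
    # Assumption: min and max sit at these plotting positions.
    lo, hi = 0.5 / n, 1.0 - 0.5 / n
    knots = [(x_min, lo)]
    for t, c in sorted(zip(thresholds, counts_above)):
        f = 1.0 - c / n  # empirical cdf at threshold t
        # Simplification: drop thresholds that do not raise the cdf
        # (empty bins) or that fall outside the observed range.
        if x_min < t < x_max and knots[-1][1] < f < hi:
            knots.append((t, f))
    knots.append((x_max, hi))

    p = min(max(p, lo), hi)  # clamp to the range spanned by min and max
    z = _STD.inv_cdf(p)
    for (x0, p0), (x1, p1) in zip(knots, knots[1:]):
        if p <= p1:
            z0, z1 = _STD.inv_cdf(p0), _STD.inv_cdf(p1)
            w = (z - z0) / (z1 - z0)  # position within the segment
            return math.exp(math.log(x0) + w * (math.log(x1) - math.log(x0)))
    return x_max


# Made-up example: 1000 values between 1 and 1000, of which 800 exceed
# 10 and 200 exceed 100, so the cdf is 0.2 at 10 and 0.8 at 100.
median = quantile_from_bins([10.0, 100.0], [800, 200], 1000, 1.0, 1000.0, 0.5)
```

With these made-up numbers the estimated median is about 31.6, the
geometric midpoint of 10 and 100, because 0.5 lies midway between 0.2
and 0.8 on the probit scale.  Each set of bins costs only a few
probit evaluations, so running this over several thousand sets should
not be a problem.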

Spencer> The image of a mixture of lognormals would suggest limits on
Spencer> the accuracy of such interpolation.

Oh, there are "limits", no doubt.  I guess the main point of my query
is to evaluate whether there are better, more theoretically sound
methods than the one I extracted from my hat.

-- 
Russell Senior         ``shtal latta wos ba padre u prett tu nashtonfi
seniorr at aracnet.com      mrlosh''  -- Bashgali Kafir for ``If you have
                         had diarrhoea many days you will surely die.''



