[R] Approximating discrete distribution by continuous distribution

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Jan 22 13:45:07 CET 2013


On 22/01/2013 11:49, Michael Haenlein wrote:
> Dear all,
>
> I have a discrete distribution showing how age is distributed across a
> population using a certain set of bands:
>
> Age <- matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1,
> dimnames=list(c("<18", "18-34", "35-64", "65+"),c()))
> Age_dist <- Age/sum(Age)
>
> For example I know that 23.94% of all people are between 0-18 years, 23.28%
> between 18-34 years and so forth.
>
> I would like to find a continuous approximation of this discrete
> distribution in order to estimate the probability that a person is for
> example 16 years old.
>
> Is there some automatic way in R through which this can be done? I tried a
> Kernel density estimation of the histogram but this does not seem to
> provide what I'm looking for.

This is not really an R question, but a statistics one.  It is almost
guesswork: if for example these were drivers in the UK, the answer is 0.
So you need to supply some information about the shape of the
distribution of <18 year olds.

You have estimates of the cumulative distribution function at c(0, 18,
35, 65, Inf) (or some better upper limit).  You want to interpolate it.
You could use linear interpolation (approx() or approxfun()), or a
monotone spline interpolation (spline() or splinefun() with a monotone
method), or any other interpolation method which meets your needs.  But
whatever you use, you will be supplying a lot of information not
actually in your data.
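
As a rough sketch of that interpolation (not a recommendation of any
particular method), something like the following could be used.  The
upper limit of 100 is an assumption that is not in the data, as is
reading "16 years old" as the age interval [16, 17); the names upper,
breaks, cdf, F_lin and F_mono are only illustrative.

```r
# Band counts from the original post: <18, 18-34, 35-64, 65+
counts <- c(74045062, 71978405, 122718362, 40489415)

# ASSUMPTION: nobody is older than 100; the data give no upper limit.
upper  <- 100
breaks <- c(0, 18, 35, 65, upper)

# Cumulative distribution function at the band boundaries
cdf <- c(0, cumsum(counts) / sum(counts))

# Linear interpolation of the CDF: the same as assuming ages are
# uniformly distributed within each band
F_lin <- approxfun(breaks, cdf, rule = 2)

# Monotone (Fritsch-Carlson) spline interpolation of the CDF
F_mono <- splinefun(breaks, cdf, method = "monoH.FC")

# P(16 <= age < 17) under each interpolant
p16_lin  <- F_lin(17)  - F_lin(16)
p16_mono <- F_mono(17) - F_mono(16)

# Implied density at age 16 under the spline (first derivative of the CDF)
d16_mono <- F_mono(16, deriv = 1)
```

The two interpolants will generally give different answers for age 16,
and neither is supported by the data: the difference comes entirely
from the shape each method imposes within the <18 band.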

>
> Thanks very much for your help,
>
> Michael
>
>
> Michael Haenlein
> Associate Professor of Marketing
> ESCP Europe
> Paris, France
>


-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


