[Rd] Quartile summary generated by density() is misleading (PR#11541)

ierickson at starmine.com ierickson at starmine.com
Fri May 30 18:15:09 CEST 2008


Full_Name: Ian Erickson
Version: 2.5.1 (2007-06-27)
OS: x86_64-redhat-linux-gnu
Submission from: (NULL) (204.16.153.138)


The quartiles printed for a density() object should intuitively be quartiles of
the estimated distribution, that is, the points at which the cumulative
probability reaches 25%, 50% and 75%. What is printed instead is simply the
quartiles of the x values used to plot the generated curve.

Example: running density(rnorm(100000)) gives a 1st quartile of -2.2 and a 3rd
quartile of +2.2. However, graphing the density using
plot(density(rnorm(100000))) shows what would be expected: at -2.2, the
cumulative probability is only a few percent rather than 25%.
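
A minimal sketch of the comparison, for anyone who wants to reproduce it. The
seed, the variable names and the crude rectangle-rule integration are
assumptions made for illustration only; this is not how density() or its print
method is implemented.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
x <- rnorm(100000)
d <- density(x)                   # default grid of equally spaced x values

probs <- c(0.25, 0.5, 0.75)

# Quartiles of the evaluation grid: these match the x column that is printed
grid_q <- unname(quantile(d$x, probs))

# Approximate quartiles of the estimated distribution:
# accumulate the density over the grid and normalise so the total is 1
step <- d$x[2] - d$x[1]
cdf  <- cumsum(d$y) * step
cdf  <- cdf / cdf[length(cdf)]

# First grid point at which the cumulative probability reaches each level
dist_q <- sapply(probs, function(p) d$x[which(cdf >= p)[1]])

print(grid_q)                     # far out in the tails
print(dist_q)                     # close to qnorm(probs)
print(qnorm(probs))               # theoretical quartiles for comparison
```

The accuracy of dist_q is limited by the grid spacing, so it should only be
read as an approximation to the quartiles of the estimated distribution.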

Let me know if you have questions. The quartiles currently reported are
trivially just the range of the plotted x values divided into four equal parts.
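
The equal-parts claim can be checked with a short sketch like the following.
The sample size and variable names are arbitrary choices for illustration, and
the check relies on the default quantile() type together with the equally
spaced grid that density() returns in d$x.

```r
d <- density(rnorm(1000))

# Range of the evaluation grid (by default it extends a few bandwidths
# beyond the extremes of the data)
rng <- range(d$x)

# Points that split that range into four equal parts
equal_split <- rng[1] + diff(rng) * c(0.25, 0.5, 0.75)

# Quartiles of the grid itself
grid_q <- unname(quantile(d$x, c(0.25, 0.5, 0.75)))

# Compare the two up to numerical tolerance
same <- isTRUE(all.equal(grid_q, equal_split))
print(same)
```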

Thanks.


