[R] Superposed histograms
Frank E Harrell Jr
fharrell at virginia.edu
Fri Jan 10 12:58:03 CET 2003
On Fri, 10 Jan 2003 10:41:31 +0000 (GMT)
Damon Wischik <djw1005 at cam.ac.uk> wrote:
>
> I would like to plot cumulative histograms. Specifically,
> I have data like
> Sex M M F M F F M F
> Height 6 6.3 6.1 5.5 7.2 6.2 5.9 6.0 ....
> and I want to plot a histogram of the distribution of all heights,
> colouring the histogram bars according to sex, for example
>
> | o
> | oo o
> | o oo ** o o = observations of women
> | o o*o***o * = observations of men
> | *o*******
> |----------
>
> (And I want this in a Trellis plot, and with more than two groups of
> observations.) How should I do this? I tried looking for imaginative
> combinations of panel.superpose and panel.histogram. I suppose if I called
> panel.histogram for the cumulative data first, then panel.histogram for
> just the data on men, with a different colour, I could achieve the effect.
> But I'd need to superpose the accumulated data, and panel.superpose seems
> to only separate the data by group, not accumulate data by group.
>
> Damon Wischik.
I don't think this will be effective from a graphical perception point of view. One problem is that the symbols for the groups stacked in the upper region do not share a common baseline, so they are perceived differently from the symbols at the bottom, which are bottom-aligned. I suggest the usual multi-panel histograms or back-to-back histograms (see e.g. histbackback in the Hmisc library). Better still would be superposed ECDFs (e.g., ecdf() in Hmisc or in Martin Maechler's package). In my view, ECDFs are much better for showing differences between distributions.
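
As a rough illustration of the superposed-ECDF idea, here is a minimal sketch that uses only base R's ecdf() (in the stats package in current versions of R) rather than the Hmisc function. The data and the variable names height and sex are made up for the example, and the loop is written so that it should also handle more than two groups.

```r
# Hypothetical data: heights for two groups (values invented for illustration)
set.seed(1)
height <- c(rnorm(50, mean = 5.9, sd = 0.3), rnorm(50, mean = 5.5, sd = 0.3))
sex <- factor(rep(c("M", "F"), each = 50))

# One empirical CDF per group; split() orders the list by factor level
groups <- levels(sex)
cdfs <- lapply(split(height, sex), ecdf)
cols <- seq_along(groups)

# Draw the first curve on axes wide enough for all groups,
# then superpose the remaining curves with add = TRUE
plot(cdfs[[1]], xlim = range(height), col = cols[1],
     do.points = FALSE, verticals = TRUE,
     main = "ECDF of height by sex",
     xlab = "Height", ylab = "Proportion <= x")
for (i in seq_along(cdfs)[-1]) {
  plot(cdfs[[i]], add = TRUE, col = cols[i],
       do.points = FALSE, verticals = TRUE)
}
legend("bottomright", legend = groups, col = cols, lty = 1)
```

All curves share the same axes and the same 0-to-1 vertical scale, so a shift between groups shows up as a horizontal gap between the curves, with no dependence on a choice of bins.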
--
Frank E Harrell Jr Prof. of Biostatistics & Statistics
Div. of Biostatistics & Epidem. Dept. of Health Evaluation Sciences
U. Virginia School of Medicine http://hesweb1.med.virginia.edu/biostat