[R] Cumulative lattice histograms

Richard.Cotton at hsl.gov.uk Richard.Cotton at hsl.gov.uk
Mon May 12 11:22:59 CEST 2008


> It's fairly straightforward to plot cumulative histograms using the 
hist()
> function. You do something like:
> 
> h <- hist(rnorm(100), plot=FALSE)
> h$counts<- cumsum(h$counts)
> plot(h)
> 
> However, I have failed to find any example where this is done using the
> lattice histogram() function. I realize I need to slightly alter the 
panel
> function panel.histogram. Specifially I would like to add the following 
line
> in red, just like I did above:
> 
> function (x, breaks, equal.widths = TRUE, type = "density", nint =
> round(log2(length(x)) +
>     1), alpha = plot.polygon$alpha, col = plot.polygon$col, border =
> plot.polygon$border,
>     lty = plot.polygon$lty, lwd = plot.polygon$lwd, ...)
> {
>     plot.polygon <- trellis.par.get("plot.polygon")
>     xscale <- current.panel.limits()$xlim
>     panel.lines(x = xscale[1] + diff(xscale) * c(0.05, 0.95),
>         y = c(0, 0), col = border, lty = lty, lwd = lwd, alpha = alpha)
>     if (length(x) > 0) {
>         if (is.null(breaks)) {
>             breaks <- if (is.factor(x))
>                 seq_len(1 + nlevels(x)) - 0.5
>             else if (equal.widths)
>                 do.breaks(range(x, finite = TRUE), nint)
>             else quantile(x, 0:nint/nint, na.rm = TRUE)
>         }
>         h <- hist.constructor(x, breaks = breaks, ...)
> 
>         h$counts<- cumsum(h$counts)
> 
>         y <- if (type == "count")
>             h$counts
>         else if (type == "percent")
>             100 * h$counts/length(x)
>         else h$intensities
>         breaks <- h$breaks
>         nb <- length(breaks)
>         if (length(y) != nb - 1)
>             warning("problem with 'hist' computations")
>         if (nb > 1) {
>             panel.rect(x = breaks[-nb], y = 0, height = y, width =
> diff(breaks),
>                 col = col, alpha = alpha, border = border, lty = lty,
>                 lwd = lwd, just = c("left", "bottom"))
>         }
>     }
> }

You are nearly there!  You just need to:

1. Specify the package of hist.constructor so R's search mechanism finds 
it.
h <- lattice:::hist.constructor(x, breaks = breaks, ...)

2. Give the new function a name.
panel.cumul.histogram <- function (x, breaks, etc.

3. Specify the function to use for the panel in your call to histogram.
histogram(<some params>, panel=panel.cumul.histogram)

4. You'll also need to play about with the y axis limits.
histogram(<some params>, ylim=c(0, something))

Here's an example with the singer data:

panel.cumul.histogram <- function (x, breaks, equal.widths = TRUE, type = 
"density", nint =
round(log2(length(x)) +
    1), alpha = plot.polygon$alpha, col = plot.polygon$col, border =
plot.polygon$border,
    lty = plot.polygon$lty, lwd = plot.polygon$lwd, ...)
{
    plot.polygon <- trellis.par.get("plot.polygon")
    xscale <- current.panel.limits()$xlim
    panel.lines(x = xscale[1] + diff(xscale) * c(0.05, 0.95),
        y = c(0, 0), col = border, lty = lty, lwd = lwd, alpha = alpha)
    if (length(x) > 0) {
        if (is.null(breaks)) {
            breaks <- if (is.factor(x))
                seq_len(1 + nlevels(x)) - 0.5
            else if (equal.widths)
                do.breaks(range(x, finite = TRUE), nint)
            else quantile(x, 0:nint/nint, na.rm = TRUE)
        }
        h <- lattice:::hist.constructor(x, breaks = breaks, ...)

        h$counts<- cumsum(h$counts)

        y <- if (type == "count")
            h$counts
        else if (type == "percent")
            100 * h$counts/length(x)
        else h$intensities
        breaks <- h$breaks
        nb <- length(breaks)
        if (length(y) != nb - 1)
            warning("problem with 'hist' computations")
        if (nb > 1) {
            panel.rect(x = breaks[-nb], y = 0, height = y, width =
diff(breaks),
                col = col, alpha = alpha, border = border, lty = lty,
                lwd = lwd, just = c("left", "bottom"))
        }
    }
}

histogram( ~ height | voice.part, 
    data = singer, 
    nint = 17,
    endpoints = c(59.5, 76.5), 
    layout = c(2,4), 
    aspect = 1,
    xlab = "Height (inches)",
    panel=panel.cumul.histogram, 
    type="count", 
    ylim=c(0,max(summary(singer$voice.part)+1)))

Having said all this, I'm not entirely convinced that a cumulative 
histogram is as useful as your standard issue empirical cumulative density 
function.

Regards,
Richie.

Mathematical Sciences Unit
HSL



------------------------------------------------------------------------
ATTENTION:

This message contains privileged and confidential inform...{{dropped:20}}



More information about the R-help mailing list