[R] Plotting coloured histograms...

Damon Wischik djw1005 at cam.ac.uk
Mon Jan 27 00:23:02 CET 2003

> Hi, I am having some trouble trying to plot a histogram in more than one
> colour. What I want to do is, plot two vectors in the same histogram, but
> with different colours, for instance:
>        > x <- rnorm(1000,20,4);
>        > y <- rnorm(1000,10,2);
>     Then I'd like to have x and y ploted on the same hist (I can do that
> already doing w <- c(x,y) then hist(w)) but the bars representing the x's should
> be in one colour and the bars representing the y should be in another one,
> so that I could see the overlaping areas of the two distributions etc.

You haven't made it clear if you want the histograms on top of each other
(which is what you get from w<-c(x,y) ) or if you want them side-by-side
(so you can see overlapping areas).

I plot histograms on top of each other using my own trellis function
panel.surpose, like this:

x <- rnorm(1000,20,4)
y <- rnorm(1000,10,2)
w <- c(x,y)
f <- ifelse(1:length(z)<=length(x),0,1)
histogram(~w, groups=f, panel=panel.surpose)

The "groups" argument should be a factor, or some other variable taking
values in a small set. The function panel.surpose takes each panel in
turn, splits the dataset for that panel into groups according to "groups",
and plots a histogram for each group, one on top of another. The
colours can be specified like "bar.col=c(rgb(1,0,0),rgb(0,1,0))", the
first colour being used for the lowest level of "groups". You could
specify a different "panel.groups" function, but I can't see how that
would be useful.

In writing this function, I had trouble with the standard routines
hist and panel.histogram, which deal with the counting. What should
their behaviour be when the data has NA values? It seems to me that
count, percentage and density do not handle NA values in a consistent way.
Anyway, I wrote a replacement for panel.histogram to use with my

Damon Wischik.

panel.histogram.partial <- 
function (x, breaks, equal.widths = TRUE, type = "density", col =
bar.fill$col, ...) 
    x <- as.numeric(x)
    grid.lines(x = c(0.05, 0.95), y = unit(c(0, 0), "native"), 
        default.units = "npc")
    if (length(x) > 0) {
        bar.fill <- trellis.par.get("bar.fill")
        if (is.null(breaks)) {
            nint <- round(log2(length(x)) + 1)
            breaks <- if (equal.widths) 
                do.breaks(range(x), nint)
            else quantile(x, 0:nint/nint)
        h <- hist(x, breaks = breaks, plot = FALSE, ...)
        y <- if (type == "count") 
        else if (type == "percent") 
            100 * h$counts/length(x)
        else h$intensities * length(which(!is.na(x)))/length(x)
        nb <- length(breaks)
        if (nb != (length(y) + 1)) 
            warning("something is probably wrong")
        if (nb > 1) {
            for (i in 1:(nb - 1)) if (y[i] > 0) {
                grid.rect(gp = gpar(fill = col), x = breaks[i], 
                  y = 0, height = y[i], width = breaks[i + 1] - 
                    breaks[i], just = c("left", "bottom"), default.units="native")

rgbmix <- function(p,col1,col2) { # p a vector. pi=0: col1. pi=1: col2
  cm <- (1-p) %o% col2rgb(col1) + p %o% col2rgb(col2)

panel.surpose <-
function (x, y = NULL, subscripts, groups, panel.groups =
  bar.col=superpose.line$col, ...)
  x <- as.numeric(x)
  if (!is.null(y)) y <- as.numeric(y)
  superpose.line <- trellis.par.get("superpose.line")
  if (length(subscripts)>0 && length(groups)<max(subscripts))
    groups <- rep(groups, length=max(subscripts))
  vals <- sort(unique(groups))
  nvals <- length(vals)
  bar.col <- if (length(bar.col)>=nvals) bar.col else {
    if (length(bar.col)>2 || length(bar.col)==1) 
    else # linearly interpolate the colours
  panel.groups <- if(is.function(panel.groups)) 
  else if (is.character(panel.groups))
  nx <- length(subscripts)
  for (i in nvals:1) {
    id <- (groups[subscripts] %in% vals[1:i])
    if (any(id)) {
      fakex <- rep(NA,times=nx-length(which(id)))
      xandfake <- c(fakex,x[id])
      args <- if (!is.null(y))
        list(x=xandfake, col=bar.col[i], lty=0, ...)

More information about the R-help mailing list