[Rd] Modified Histogram functions

Kjetil Kjernsmo kjetil.kjernsmo@astro.uio.no
Sun, 9 Jul 2000 21:34:24 +0200 (MET DST)


Dear all,

I have done further modifications on the histogram functions that I
reported earlier this year, and I hope this can be of use and perhaps
included in the distribution. I have been using this stuff a couple of
months myself, and while it is nothing sophisticated, it has it's
applications. :-) I did a few small modifications today to make it a bit
more compact.

I have modified the hist.default() function so that the list it returns
has class "histogram". I have put the plotting in a separate function,
plot.histogram(), which is called from hist.default if plot=T, and I have
made a lines.histogram() (calling it "lines", is that a wise choice? It
seemed most natural, but then again....), which can be used to overlay
histograms on existing histograms. Here it is obviously a lot of room for
improvement. E.g. calling plot.histogram() with multiple histogram
objects could produce a plot where coloumns representing the same values
are grouped, I haven't addressed that. The code is below my .sig.

As you can see, there are few changes from your code. The biggest change
is that hist.default() calls plot.histogram() (or rather plot()). Also
note that the warning about wrong areas only appears in hist.default(). I
figured that the user is most likely created the histogram object with
hist, so s/he might no want to be bothered with this warning again. I
don't know if this is wise, s/he might just have forgotten or something. :-)

I guess plot.histogram() may also be made more compact by calling
lines.histogram() to plot the actual rects, but I didn't find a practical
way of doing this, so I didn't... :-) And, contrary to
the other functions making histogram plots, lines.histogram() does take
lty as an argument (useful for making dotted borders, to separate one
histogram from another, particulary useful when using colors).

I've got a function for adding two similar histograms as well, but that's
so hacked up I guess it isn't worth much.

BTW, I found a typo in the help for rect:
     xpd: logical (``expand''); if `FLASE', everything is clipped to
          the plot region.

Hope this is useful for someone... :-)

Best,

Kjetil
-- 
Kjetil Kjernsmo
Graduate astronomy-student                    Problems worthy of attack
University of Oslo, Norway            Prove their worth by hitting back
E-mail: kjetikj@astro.uio.no                                - Piet Hein
Homepage <URL:http://www.astro.uio.no/~kjetikj/>
Webmaster@skepsis.no 

hist <- function(x, ...) UseMethod("hist")

hist.default <-
function (x, breaks, freq= NULL, probability = !freq, include.lowest= TRUE,
       right=TRUE, col = NULL, border = par("fg"),
       main = paste("Histogram of" , deparse(substitute(x))),
       xlim = range(breaks), ylim = range(y, 0),
       xlab = deparse(substitute(x)), ylab,
       axes = TRUE, plot = TRUE, labels = FALSE, nclass = NULL, ...)
{
  if (!is.numeric(x))
    stop("hist: x must be numeric")
  main # eval() now: defeat lazy eval
  xlab
  n <- length(x <- x[!is.na(x)])
  use.br <- !missing(breaks)
  if(use.br) {
    if(!missing(nclass))
      warning("`nclass' not used when `breaks' specified")
  }
  else if(!is.null(nclass) && length(nclass) == 1)
    breaks <- nclass
  use.br <- use.br && (nB <- length(breaks)) > 1
  if(use.br)
    breaks <- sort(breaks)
  else { # construct vector of breaks
    rx <- range(x)
    nnb <-
    if(missing(breaks)) 1 + log2(n)
    else { # breaks = `nclass'
      if (is.na(breaks) | breaks < 2)
        stop("invalid number of breaks")
      breaks
    }
    breaks <- pretty (rx, n = nnb, min.n=1, eps.corr = 2)
    nB <- length(breaks)
    if(nB <= 1) { ##-- Impossible !
      stop(paste("hist.default: pretty() error, breaks=",format(breaks)))
    }
  }
  storage.mode(x) <- "double"
  storage.mode(breaks) <- "double"
  counts <- .C("bincount",
               x,
               n,
               breaks,
               nB,
               counts = integer(nB - 1),
               right = as.logical(right),
               include= as.logical(include.lowest),
               NAOK = FALSE, DUP = FALSE, PACKAGE = "base") $counts
  if (any(counts < 0))
  stop("negative `counts'. Internal Error in C-code for \"bincount\"")
  if (sum(counts) < n)
  stop("some `x' not counted; maybe `breaks' do not span range of `x'")
  h <- diff(breaks)
  if (!use.br && any(h <= 0))
  stop("not strictly increasing `breaks'.")
  if (is.null(freq)) {
    freq <- if(!missing(probability))
    !as.logical(probability)
    else if(use.br) {
      ##-- Do frequencies if breaks are evenly spaced
      max(h)-min(h) < 1e-7 * mean(h)
    } else TRUE
  } else if(!missing(probability) && any(probability == freq))
    stop("`probability is an alias for `!freq', however they differ.")
  intensities <- counts/(n*h)
  mids <- 0.5 * (breaks[-1] + breaks[-nB])
  y <- if (freq) counts else intensities
  r <- structure(list(breaks = breaks, counts = counts,
                      intensities = intensities, mids = mids),
                 class="histogram")
  if (plot) {
    if(freq && use.br && max(h)-min(h) > 1e-7 * mean(h))
    warning("the AREAS in the plot are wrong -- maybe use `freq=FALSE'")
    plot(r, freq = freq, probability = !freq, col = col, border = border,
         main = main, xlim = xlim, ylim = ylim, xlab = xlab, ylab = ylab,
         axes = axes, labels = labels, ...)
    invisible(r)
  }
  else r
}

plot.histogram <- function (x, freq= NULL, probability = !freq, 
                            col = NULL, border = par("fg"),
                            main = paste("Histogram of" ,
                              deparse(substitute(x))),
                            xlim = range(x$breaks), ylim = range(y, 0),
                            xlab = deparse(substitute(x)), ylab,
                            axes = TRUE, labels = FALSE,  ...)
{
  if (is.null(freq)) {
    freq <- if(!missing(probability))
      !as.logical(probability) else TRUE
  } else if(!missing(probability) && any(probability == freq))
  stop("`probability' is an alias for `!freq', however they differ.")
  y <- if (freq) x$counts else x$intensities
  nB <- length(x$breaks)
  plot.new()
  plot.window(xlim, ylim, "") #-> ylim's default from 'y'
  if (missing(ylab))
    ylab <- paste(if(!freq)"Relative ", "Frequency", sep="")
  title(main = main, xlab = xlab, ylab = ylab, ...)
  if(axes) {
    axis(1, ...)
    axis(2, ...)
  }
  rect(x$breaks[-nB], 0, x$breaks[-1], y,
       col = col, border = border)
  if(labels)
    text(x$mids, y,
         labels = if(freq) x$counts else round(x$intensities,3),
         adj = c(0.5, -0.5))
  invisible()
}

lines.histogram <- function(x, freq=NULL, probability=!freq,
                            col=NULL, border=par("fg"), lty=NULL...)
{
  if (is.null(freq)) {
    freq <- if(!missing(probability))
      !as.logical(probability) else TRUE
  } else if(!missing(probability) && any(probability == freq))
    stop("`probability' is an alias for `!freq', however they differ.")
  y <- if (freq) x$counts else x$intensities
  nB <- length(x$breaks)
  rect(x$breaks[-nB], 0, x$breaks[-1], y,
       col = col, border = border, lty = lty)
  invisible()
}
 

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._