[Rd] too-large notches in boxplot (PR #7690)
Martin Maechler
maechler at stat.math.ethz.ch
Tue Nov 7 11:35:54 CET 2006
>>>>> "Ben" == Ben Bolker <bolker at zoo.ufl.edu>
>>>>> on Mon, 23 Jan 2006 14:37:18 -0500 writes:
Ben> PR #7690 points out that if the confidence intervals (+/-1.58
Ben> IQR/sqrt(n)) in a boxplot with notch=TRUE are larger than the
Ben> hinges -- which is most likely to happen for small n and asymmetric
Ben> distributions -- the resulting plot is ugly, e.g.:
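set.seed(1001)
npts <- 5                                      # small n per group
X <- rnorm(2*npts, rep(3:4, each=npts), sd=1)  # two groups, means 3 and 4
f <- factor(rep(1:2, each=npts))
boxplot(X~f)                                   # ordinary boxplots
boxplot(X~f, notch=TRUE)                       # notches extend beyond the hinges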
Ben> I can imagine debate about what should be done in this case --
Ben> you could just say "don't do that", since the notches are based
Ben> on an asymptotic argument ... the diff below just truncates
Ben> the notches to the hinges, but produces a warning saying that the
Ben> notches have been truncated.
Ben> ?? what should the behavior be ??
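A side note on why small n makes this unavoidable: the notch spans
2 * 1.58 * IQR/sqrt(n) in total, which is wider than the box itself
(one hinge-to-hinge IQR) whenever n <= 9 and the hinge spread is nonzero.
A minimal sketch of the check, using boxplot.stats(); the helper name
notch_outside is made up for illustration and is not part of R or of
the patch:

```r
## Sketch: do the notch limits fall outside the hinges?
set.seed(1001)
npts <- 5
X <- rnorm(2 * npts, rep(3:4, each = npts), sd = 1)
f <- factor(rep(1:2, each = npts))

## boxplot.stats() returns
##   $stats: lower whisker, lower hinge, median, upper hinge, upper whisker
##   $conf : lower and upper notch limits (median -/+ 1.58 * IQR / sqrt(n))
notch_outside <- function(x) {   # hypothetical helper, for illustration only
    b <- boxplot.stats(x)
    b$conf[1] < b$stats[2] || b$conf[2] > b$stats[4]
}

## One logical value per group; TRUE means the notch leaves the box.
outside <- tapply(X, f, notch_outside)
print(outside)
```

With only 5 points per group, both groups should be flagged, since the
notch is necessarily wider than the box at that sample size.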
This has come up again more recently (than January!), and IIRC I
argued that the plotting behavior should not be changed, for reasons
of backward compatibility, "you get what you deserve", etc.
OTOH, users should at least notice that something "unusual" is
happening, so I have used part of Ben's proposed patch to simply
issue a warning when the notches go beyond the hinges, i.e., outside
the "box" of the boxplot.
new>> Warning message:
new>> some notches went outside hinges ('box'): maybe set notch=FALSE
I hope that this helps all those who were puzzled
by examples like the one above.
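For scripts that draw many notched boxplots, the warning can be collected
instead of just printed. A rough sketch, assuming an R version that
includes this change; the variable name msgs is illustrative, the data
are the same small-n example, and the exact message text is not relied
upon:

```r
## Sketch: collect warnings raised while drawing a notched boxplot.
set.seed(1001)
npts <- 5
X <- rnorm(2 * npts, rep(3:4, each = npts), sd = 1)
f <- factor(rep(1:2, each = npts))

pdf(NULL)                  # null PDF device: nothing is written to disk
msgs <- character(0)       # illustrative container for warning messages
withCallingHandlers(
    boxplot(X ~ f, notch = TRUE),
    warning = function(w) {
        ## record the message, then suppress the default printing
        msgs <<- c(msgs, conditionMessage(w))
        invokeRestart("muffleWarning")
    })
invisible(dev.off())

print(msgs)
```

If msgs is non-empty, the simplest remedy is the one the message
suggests: redraw with notch=FALSE.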
Martin Maechler, ETH Zurich
with thanks to Ben for his perseverance (:-)
Ben> the diff is against the 11 Jan version of R 2.3.0
Ben> *** newboxplot.R 2006-01-23 14:32:12.000000000 -0500
Ben> --- oldboxplot.R 2006-01-23 14:29:29.000000000 -0500
Ben> ***************
Ben> *** 84,98 ****
Ben> bplt <- function(x, wid, stats, out, conf, notch, xlog, i)
Ben> {
Ben> ## Draw single box plot
Ben> - conf.ok <- TRUE
Ben> - if(!any(is.na(stats))) {
Ben> - ## check for overlap of notches and hinges
Ben> - if (notch && (stats[2]>conf[1] || stats[4]<conf[2])) {
Ben> - conf.ok <- FALSE
Ben> - conf[1] <- max(conf[1],stats[2])
Ben> - conf[2] <- min(conf[2],stats[4])
Ben> - }
Ben> ## stats = +/- Inf: polygon & segments should handle
Ben> ## Compute 'x + w' -- "correctly" in log-coord. case:
Ben> --- 84,91 ----
Ben> bplt <- function(x, wid, stats, out, conf, notch, xlog, i)
Ben> {
Ben> ## Draw single box plot
Ben> + if(!any(is.na(stats))) {
Ben> ## stats = +/- Inf: polygon & segments should handle
Ben> ## Compute 'x + w' -- "correctly" in log-coord. case:
Ben> ***************
Ben> *** 148,154 ****
Ben> domain = NA)
Ben> }
Ben> }
Ben> - return(conf.ok)
Ben> } ## bplt
Ben> if(!is.list(z) || 0 == (n <- length(z$n)))
Ben> --- 141,146 ----
Ben> ***************
Ben> *** 239,252 ****
Ben> xysegments <- segments
Ben> }
Ben> - conf.ok <- numeric(n)
Ben> for(i in 1:n)
Ben> ! conf.ok[i] <- bplt(at[i], wid=width[i],
Ben> stats= z$stats[,i],
Ben> out = z$out[z$group==i],
Ben> conf = z$conf[,i],
Ben> notch= notch, xlog = xlog, i = i)
Ben> ! if (any(!conf.ok)) warning("some confidence limits > hinges: notches truncated")
Ben> axes <- is.null(pars$axes)
Ben> if(!axes) { axes <- pars$axes; pars$axes <- NULL }
Ben> if(axes) {
Ben> --- 231,243 ----
Ben> xysegments <- segments
Ben> }
Ben> for(i in 1:n)
Ben> ! bplt(at[i], wid=width[i],
Ben> stats= z$stats[,i],
Ben> out = z$out[z$group==i],
Ben> conf = z$conf[,i],
Ben> notch= notch, xlog = xlog, i = i)
Ben> !
Ben> axes <- is.null(pars$axes)
Ben> if(!axes) { axes <- pars$axes; pars$axes <- NULL }
Ben> if(axes) {
Ben> --
Ben> 620B Bartram Hall bolker at zoo.ufl.edu
Ben> Zoology Department, University of Florida http://www.zoo.ufl.edu/bolker
Ben> Box 118525 (ph) 352-392-5697
Ben> Gainesville, FL 32611-8525 (fax) 352-392-3704