[R] unexpected behavior of boxplot(x, notch=TRUE, log="y")
Ben Bolker
bolker at zoo.ufl.edu
Sun Oct 8 01:40:01 CEST 2006
bogdan romocea <br44114 <at> gmail.com> writes:
>
> A function I've been using for a while returned a surprising [to me,
> given the data] error recently:
> Error in plot.window(xlim, ylim, log, asp, ...) :
> Logarithmic axis must have positive limits
>
> After some digging I realized what was going on:
> x <- c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
> 42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
> 1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
> 1, 46492.05, 22334.88, 1, 1)
> summary(x)
> boxplot(x,notch=TRUE,log="y") #unexpected
> boxplot(x) #ok
> boxplot(x,log="y") #ok
> boxplot(x,notch=TRUE) #aha
>
Mick Crawley (author of several books on ecological
data analysis in R) submitted a related issue as
bug #7690, which I was mildly surprised to see
filed as "not reproducible" (I had no trouble reproducing
it at the time). I posted my then-patch to R-devel:
https://stat.ethz.ch/pipermail/r-devel/2006-January/036257.html
The problem typically occurs
for very small data sets, when the notches can
extend beyond the hinges.
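
A minimal illustration of notches extending past the hinges, using toy data
(the values 1 to 7 are an assumption chosen only for demonstration, not taken
from any report). boxplot.stats() returns the hinges in $stats and the notch
limits in $conf:

```r
# Toy data, chosen only for illustration (not from the original report)
y <- 1:7
b <- boxplot.stats(y)

hinges <- b$stats[c(2, 4)]   # lower and upper hinge
notch  <- b$conf             # median +/- 1.58 * IQR / sqrt(n)

print(hinges)
print(notch)

# TRUE when a notch limit lies outside the corresponding hinge
outside <- notch[1] < hinges[1] || notch[2] > hinges[2]
print(outside)
```

With only seven points the factor 1/sqrt(n) is still large, so the notch
interval is wider than the box itself.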
As I said then,
> I can imagine debate about what should be done in this case --
> you could just say "don't do that", since the notches are based
> on an asymptotic argument ... the diff below just truncates
> the notches to the hinges, but produces a warning saying that the
> notches have been truncated.
The interaction with
log="y" is new to me, though, and my old patch
didn't catch it.
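
A rough check on the data from the original post, as a sketch rather than a
diagnosis: assuming the plotting code includes the notch limits when it works
out the y range, a negative lower notch would be enough to trigger the
"positive limits" error even though every data value is positive.

```r
# Data copied from the original post
x <- c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
       42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
       1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
       1, 46492.05, 22334.88, 1, 1)

b <- boxplot.stats(x)

print(b$stats)  # whiskers, hinges and median
print(b$conf)   # notch limits

# Is the lower notch limit unusable on a log axis?
lower_notch_negative <- b$conf[1] < 0
print(lower_notch_negative)
```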
Here's my reproducible version:
set.seed(1001)
npts <- 7
X <- rnorm(2*npts,rep(c(3,4.5),each=npts),sd=1)
f <- factor(rep(1:2,each=npts))
par(mfrow=c(1,2))
boxplot(X~f,notch=TRUE)
A possible fix is to truncate the notches
(and issue a warning) when this happens,
in src/library/grDevices/R/calc.R:
*** calc.R 2006-10-07 17:44:49.000000000 -0400
--- newcalc.R 2006-10-07 19:25:38.000000000 -0400
***************
*** 16,21 ****
--- 16,26 ----
if(any(out[nna])) stats[c(1, 5)] <- range(x[!out], na.rm = TRUE)
}
conf <- if(do.conf) stats[3] + c(-1.58, 1.58) * iqr / sqrt(n)
+ if (do.conf) {
+ if (conf[1]<stats[2] || conf[2]>stats[4]) warning("confidence limits > hinges: notches truncated")
+ conf[1] <- max(conf[1],stats[2])
+ conf[2] <- min(conf[2],stats[4])
+ }
list(stats = stats, n = n, conf = conf,
out = if(do.out) x[out & nna] else numeric(0))
}
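
The same truncation can be tried without patching R, as a sketch under stated
assumptions: truncate_notches() below is a hypothetical helper (not part of R
or of the patch) that wraps boxplot.stats() and clamps $conf to the hinges,
with the same warning text as in the diff.

```r
# Hypothetical helper: clamp the notch limits to the hinges, with a warning
truncate_notches <- function(x, coef = 1.5) {
  b <- boxplot.stats(x, coef = coef, do.conf = TRUE)
  lower <- b$stats[2]  # lower hinge
  upper <- b$stats[4]  # upper hinge
  if (b$conf[1] < lower || b$conf[2] > upper) {
    warning("confidence limits > hinges: notches truncated")
    b$conf <- c(max(b$conf[1], lower), min(b$conf[2], upper))
  }
  b
}

# Toy data (an assumption for illustration): both notch limits get clamped
res <- truncate_notches(1:7)
print(res$stats[c(2, 4)])  # hinges
print(res$conf)            # notch limits after truncation
```

Since the hinges are data-based, the truncated notch limits stay positive
whenever the data are positive, which is what a log axis needs.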