[R] Debugging R's code: boxplot.stats

Matthew Walker m.g.walker at massey.ac.nz
Mon Oct 30 23:30:10 CET 2006


Hi Martin,

Sorry, I did intend to put some examples in but must have forgotten.
This example is easy to understand:

boxplot(c(1,Inf,Inf,Inf,Inf))

has lower bound: 1, lower quartile: Inf, median: Inf, upper quartile:
Inf, upper bound: Inf.
The command currently fails with the obscure message:
"Error in if (any(out[nna])) stats[c(1, 5)] <- range(x[!out], na.rm =
TRUE) : missing value where TRUE/FALSE needed"
In this case I think boxplot should draw a (lower) whisker at 1 and
nothing else.  If my alteration is made, boxplot draws just that.
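
A minimal sketch of why the condition fails, rebuilt by hand from the error message and the one-line alteration quoted further down. It is a simplified reconstruction, not the actual boxplot.stats source: the names nna, out and stats come from the error message, while coef = 1.5 and the fence formula are assumptions based on the usual 1.5 * IQR rule. It does not call boxplot() itself, since other R versions may behave differently.

```r
# Simplified reconstruction of the outlier test; NOT the actual R source.
x    <- c(1, Inf, Inf, Inf, Inf)
coef <- 1.5                          # assumed multiplier for the whisker range

nna   <- !is.na(x)                   # Inf is not NA, so every entry is TRUE
stats <- fivenum(x, na.rm = TRUE)    # five-number summary of x
iqr   <- diff(stats[c(2, 4)])        # upper hinge minus lower hinge

# Both fences are NaN, so every comparison below yields NA.
out <- x < (stats[2] - coef * iqr) | x > (stats[4] + coef * iqr)

old_cond <- any(out[nna])                 # current condition
new_cond <- any(out[nna], na.rm = TRUE)   # condition with the alteration

# Passing the old condition to if() is what raises the error.
old_msg <- tryCatch(
  { if (old_cond) "adjusted" else "unchanged" },
  error = function(e) conditionMessage(e)
)

# With the altered condition the branch is skipped and stats are kept.
if (new_cond) stats[c(1, 5)] <- range(x[!out], na.rm = TRUE)
print(stats)
```

Here old_cond is NA while new_cond is FALSE, so the five statistics stay as computed and only the lower whisker at 1 is finite.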

The argument against this change is that the "range" argument
(see ?boxplot) can no longer be taken into account (but what is the value
of the inter-quartile range anyway? Inf - Inf = NaN.)

The smallest example to demonstrate this issue is:

boxplot(c(1,Inf,Inf))
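
As a rough check on why three values are needed when only one of them is finite, the hinges can be inspected directly with fivenum(). This is a sketch that assumes the inter-quartile range is taken as the difference between the two hinges; it does not call boxplot().

```r
two   <- fivenum(c(1, Inf))        # lower hinge is still the finite value
three <- fivenum(c(1, Inf, Inf))   # both hinges are infinite

iqr_two   <- diff(two[c(2, 4)])    # Inf - 1
iqr_three <- diff(three[c(2, 4)])  # Inf - Inf
print(c(two = iqr_two, three = iqr_three))
```

With two values the inter-quartile range is Inf rather than NaN, so the comparisons stay well defined; with three it becomes NaN.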

I hope that's helpful!

Cheers,

Matthew

On Mon, 2006-10-30 at 15:53 +0100, Martin Maechler wrote:
> Hi Matthew,
> 
> I'm considering applying your proposed change to the R sources
> of boxplot.stats().
> 
> However, I'd like to see "interesting" examples where the new
> and old version exhibit differences.
> Since you've now spent so much time on this,
> you will have a small reproducible example, won't you?
> 
> Thanks in advance!
> Martin 
> 
> Martin <Maechler at stat.math.ethz.ch>  http://stat.ethz.ch/people/maechler
> Seminar für Statistik, ETH Zürich  LEO C16	Leonhardstr. 27
> CH-8092 Zurich, SWITZERLAND
> phone: +41-44-632-3408       fax: ...-1228      <><
> 
> 
> >>>>> "Matthew" == Matthew Walker <m.g.walker at massey.ac.nz>
> >>>>>     on Mon, 30 Oct 2006 19:32:25 +1300 writes:
> 
>     Matthew> On Sun, 2006-10-29 at 20:18 -0500, Duncan Murdoch
>     Matthew> wrote:
>     >> If you're sure your change is a good idea then post a
>     >> patch here along with an explanation of why it's so good:
>     >> and it might make it into the next release.
> 
>     Matthew> Thank you to both Duncan and Gabor, your help was
>     Matthew> really appreciated.
> 
>     Matthew> My 10 character alteration did what I hoped it
>     Matthew> would.  So I'd like to offer it to you or whoever
>     Matthew> else might be interested.
> 
>     Matthew> Boxplot does its job well; it even mostly "works"
>     Matthew> with infinite values by not plotting certain lines.
>     Matthew> For example, if the upper bound is infinite, the
>     Matthew> upper whisker isn't plotted.
> 
>     Matthew> However boxplot doesn't work if the upper bound,
>     Matthew> upper quartile, median, and lower quartile are all
>     Matthew> infinite.  Although there is sufficient data to
>     Matthew> plot a lower bound, boxplot.stats errors instead.
>     Matthew> This error also occurs when the lower bound through
>     Matthew> to the upper quartile are negative infinity.
> 
>     Matthew> This is not tricky to fix.  All that needs to
>     Matthew> change is line 14 of boxplot.stats.  It currently
>     Matthew> reads "if (any(out[nna]))".  Changing it to "if
>     Matthew> (any(out[nna], na.rm=TRUE))" fixes these issues.
> 
>     Matthew> Cheers,
> 
>     Matthew> Matthew
> 


