[R] mild and extreme outliers in boxplot

Gavin Simpson gavin.simpson at ucl.ac.uk
Thu Aug 20 09:23:51 CEST 2009


Hi Rolf,

On Thu, 2009-08-20 at 12:31 +1200, Rolf Turner wrote:
> 
> I despair.  Why do you keep insisting that black is white?
> The OP wanted to be able to specify an argument to boxplot()
> that would cause it to plot mild and extreme outliers with
> different symbols.

Then you and I were reading different emails, Rolf. I quote:
[Sorry, I don't have the original email, so this is from Gmane:
http://article.gmane.org/gmane.comp.lang.r.general/159957 ]


> dear all,
> 
> could somebody tell me how I can plot mild outliers as a circle(°) and
> extreme outliers as an asterisk(*) in a box-whisker plot?
> 
> Thanks very much in advance

The OP wants to use a different pch for the points that are plotted at
the extremes of the distribution of points. It is up to him, as I
pointed out, to define what "extreme" means in this context...

> 
> 	THIS CAN'T BE DONE!!!

...but the plotting he wanted to do can be done! I showed one way it
could be done. Why do you insist he can't do what he wants?
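
A minimal sketch of one such approach, not necessarily the code posted
earlier in the thread. It assumes "mild" means beyond 1.5 times the box
height and "extreme" means beyond 3 times the box height; both cut-offs
and the made-up data are illustrative assumptions, not something the OP
specified.

```r
## Sketch only: the 1.5 and 3 cut-offs and the data are assumptions.
set.seed(1)
x <- c(rnorm(50), 4, -4.5, 8, -9)   # made-up data with a few far-out values

## boxplot.stats() returns the points beyond the whiskers in $out;
## 'coef' sets how many box heights the whiskers may extend.
mild    <- boxplot.stats(x, coef = 1.5)$out   # everything beyond 1.5 box heights
extreme <- boxplot.stats(x, coef = 3)$out     # the subset beyond 3 box heights
mild.only <- setdiff(mild, extreme)           # note: setdiff() drops duplicated values

## Draw the box and whiskers without the default outlier points,
## then add the two groups by hand with different plotting characters.
boxplot(x, outline = FALSE, ylim = range(x))
points(rep(1, length(mild.only)), mild.only, pch = 1)   # open circle
points(rep(1, length(extreme)),   extreme,   pch = 8)   # asterisk-like star
```

The whiskers are still drawn with boxplot()'s default of 1.5 box
heights, so only the symbols for the individual points change.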

> 
> There is no such argument specification.

Agreed (if you discount 'coef' of boxplot.stats()), but the OP didn't
ask specifically for an argument. You appear to have taken a very narrow
view of what the OP wanted and then objected that someone had the
temerity to suggest the OP read the help pages which, as Bert pointed
out, do show you how to alter the plotting characters.

The whole point of R, for me at least, is that R Core and the wonderful
community of developeRs we are blessed with, provide me with the base
tools to do general statistical computing and add-on functionality for
specific tasks respectively, but because everything is open and R is a
programming language, I *can* do things, with a bit of extra effort, by
resorting to writing a few lines of extra code to build on what /is/
provided.

> 
> The response to which I objected implied that it *could* be
> done and would have had the OP tearing his hair out reading
> and re-reading the help pages and wondering what he was missing.
> 
> OF COURSE it ``can be done'' with some hacking.  You can do anything
> in R.  But it can't be done simply by specifying an argument to the
> given function.

Which only you seem to think the OP was requesting.

> 
> When I said that the appropriate response was ``it can't be done''
> I was being stylistically terse.  What I would actually have said
> is something like ``It can't be done directly; you'll need to do
> some coding to get the effect you want.  I'm not sure what coding,
> or how difficult it might be.''  Then somebody else could follow
> up with suggestions as to appropriate code, if they felt like it.
> 
> As to whether the hacking/coding is ``substantial'', that is indeed
> subjective.  What is a simple task for you would very likely be pretty
> daunting to the OP.

Agreed, which is why I felt it instructive and not a waste of my time to
provide /a/ solution when I responded to this thread.

>   If he could do it, he probably wouldn't have
> asked the question in the first place.
> 
> The response to which I objected was SERIOUSLY MISLEADING.  And
> therefore objectionable.  Full stop.

In your humble opinion, Rolf? Are you open to the possibility, however
remote, that others may have a different opinion to yours?

Regardless of whether you thought the original response was way off base
or not, you have publicly admonished the responder for providing an
answer on a public list to a rather vague question. You are entitled to
your views, of course, but you could have emailed the responder
off-list, in a polite and diplomatic manner and pointed out why you
thought their response was "SERIOUSLY MISLEADING", but instead, you wade
in with the public criticism. That is what I thought was off base, and
why I replied.

We've both had our say now and clearly we have different opinions on
what was asked, said, etc. I'd be happy to continue this off-list, but
for now I'm going to spare the inboxes of the other thousands of
R-HelpeRs and get on with the day job.

All the best,

G

> 
> 	cheers,
> 
> 		Rolf Turner
> 
> On 20/08/2009, at 10:53 AM, Gavin Simpson wrote:
> 
> > On Thu, 2009-08-20 at 09:58 +1200, Rolf Turner wrote:
> >> On 20/08/2009, at 9:39 AM, Gavin Simpson wrote:
> >>
> >> 	<snip>
> >>
> >>> Criticising correct, if cryptic or high-level, responses to a list
> >>> where people give their time for free, *and* not providing a more
> >>> complete solution, is unfair, Rolf. The OP is free to respond and
> >>> ask for additional help once they've given it a go if they are
> >>> still having trouble.
> >>
> >> 	When the ``correct response'' is seriously misleading, as
> >> 	this one was --- the implication of the response was that
> >> 	the specified task *could* be done (if one looked hard
> >> 	enough at the help files), when in fact the specified task
> >> 	can't be done (at least not without substantial hacking)
> >
> > Hardly "substantial hacking", Rolf, and somewhat educational as regards
> > the underlying functions used by boxplot. My suggestion is 9 lines of
> > code, and it only stretches to 9 because I did each step in turn to
> > make it easier to understand/explain.
> >
> >> 	--- then I think criticism is merited.
> >>
> >> 	Also when a clear answer (``It can't be done.'') is as easy to
> >> 	give as an obscurantist misleading one (``RTFM'') then criticism
> >> 	is merited.
> >
> > Sorry Rolf, but "it can't be done" is somewhat subjective. All one is
> > doing is plotting a character on a graphics device at a certain
> > location, with the actual character determined on the basis of some a
> > priori determined indicator of "extreme" outlyingness. I showed how it
> > *could* be done, by manipulating the 'coef' argument of
> > boxplot.stats(), which works if you can couch your definition of
> > "extreme" in terms of the box height.
> >
> > ## install.packages("fortunes")
> > require("fortunes")
> > fortune("this is R")
> >
> > :-)
> >
> > This is not to say that I necessarily think one should do this, but
> > the author of boxplot.stats must have envisaged a situation where you
> > might want to alter the definition of "outlier" (not that that is the
> > right word in this case as these observations are potentially just
> > extreme, not necessarily outliers). After all, all we are doing is
> > determining how far from the box centre we would like to start showing
> > individual observations.
> >
> > I admit that RTFM wasn't that helpful for a newbie (I said as much),
> > but replying with "it can't be done" is just as useless if not more
> > so. In this case one can do it if one has some definition of "extreme"
> > that allows you to determine which points, if any, to draw. Showing
> > how that can be done but wrapping it in suggestive language that this
> > might not be a "Good Idea" (TM) is better than your suggested response.
> >
> > G
> >
> >>
> >> 	There is a difference between saying RTFM to a poster who has
> >> 	clearly been too lazy to do his or her homework and saying RTFM
> >> 	to a poster when TFM is not at all clear with respect to the
> >> 	question posed.  There are so many arguments to bxp() that anyone
> >> 	might be forgiven for thinking ``There must be a way to do what
> >> 	I want; I just haven't twigged to the correct way of putting
> >> 	these arguments together.''  Deliberately steering a new user
> >> 	into such a misapprehension is unforgivable.
> >>
> >> 		cheers,
> >>
> >> 			Rolf Turner
> >>
> >> ######################################################################
> >> Attention:
> >> This e-mail message is privileged and confidential. If you are not the
> >> intended recipient please delete the message and notify the sender.
> >> Any views or opinions presented are solely those of the author.
> >>
> >> This e-mail has been scanned and cleared by MailMarshal
> >> www.marshalsoftware.com
> >> ######################################################################
> > -- 
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> >  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> >  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> >  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> >  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >
> 
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



