[R] mild and extreme outliers in boxplot

Gabor Grothendieck ggrothendieck at gmail.com
Thu Aug 20 02:57:38 CEST 2009


I agree it's not completely obvious from that answer, but that
does not mean the responder deserves to be attacked.

The circle part is actually the default and, albeit with difficulty,
the help files do give the info we need to produce this:

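# Draw the boxplot but suppress its own outlier symbols (outpch = NA).
bp <- boxplot(c(1:50, 80, 100, 200), outpch = NA)
# Re-draw the outliers by hand: circles for 80 and 100, a star for 200.
with(bp, points(group, out, pch = c(1, 1, 8)))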

which can be suitably generalized for other situations.
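
One possible generalization is sketched below. It is only a sketch:
boxplot_mild_extreme is a made-up helper name, not anything in base R,
and treating points more than 3 box heights beyond the hinges as "extreme"
is an assumption (it reuses the 'coef' argument of boxplot.stats() that
is discussed in the quoted messages). It handles a single numeric vector.

```r
# Hypothetical helper (not part of base R): draw a boxplot of one numeric
# vector, marking mild and extreme outliers with different symbols.
boxplot_mild_extreme <- function(x, mild_pch = 1, extreme_pch = 8,
                                 extreme_coef = 3, ...) {
  # Suppress the default outlier symbols; boxplot() still returns the
  # outlying values in bp$out (default rule: coef = 1.5).
  bp <- boxplot(x, outpch = NA, ...)

  # Assumption: "extreme" means beyond extreme_coef box heights from the hinges.
  extreme_out <- boxplot.stats(x, coef = extreme_coef)$out
  is_extreme <- bp$out %in% extreme_out

  # Add the outliers back by hand, choosing the symbol point by point.
  if (length(bp$out) > 0) {
    points(bp$group, bp$out,
           pch = ifelse(is_extreme, extreme_pch, mild_pch))
  }

  # Return the classification invisibly so it can be inspected.
  invisible(data.frame(value = bp$out, extreme = is_extreme))
}

res <- boxplot_mild_extreme(c(1:50, 80, 100, 200))
```

For the data above this flags 200 as extreme and leaves 80 and 100 as
mild, matching the hand-written pch = c(1, 1, 8).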

On Wed, Aug 19, 2009 at 8:31 PM, Rolf Turner<r.turner at auckland.ac.nz> wrote:
>
>
> I despair.  Why do you keep insisting that black is white?
> The OP wanted to be able to specify an argument to boxplot()
> that would cause it to plot mild and extreme outliers with
> different symbols.
>
>        THIS CAN'T BE DONE!!!
>
> There is no such argument specification.
>
> The response to which I objected implied that it *could* be
> done and would have had the OP tearing his hair out reading
> and re-reading the help pages and wondering what he was missing.
>
> OF COURSE it ``can be done'' with some hacking.  You can do anything
> in R.  But it can't be done simply by specifying an argument to the
> given function.
>
> When I said that the appropriate response was ``it can't be done''
> I was being stylistically terse.  What I would actually have said
> is something like ``It can't be done directly; you'll need to do
> some coding to get the effect you want.  I'm not sure what coding,
> or how difficult it might be.''  Then somebody else could follow
> up with suggestions as to appropriate code, if they felt like it.
>
> As to whether the hacking/coding is ``substantial'', that is indeed
> subjective.  What is a simple task for you would very likely be pretty
> daunting to the OP.  If he could do it, he probably wouldn't have
> asked the question in the first place.
>
> The response to which I objected was SERIOUSLY MISLEADING.  And
> therefore objectionable.  Full stop.
>
>        cheers,
>
>                Rolf Turner
>
> On 20/08/2009, at 10:53 AM, Gavin Simpson wrote:
>
>> On Thu, 2009-08-20 at 09:58 +1200, Rolf Turner wrote:
>>>
>>> On 20/08/2009, at 9:39 AM, Gavin Simpson wrote:
>>>
>>>        <snip>
>>>
>>>> Criticising correct, if cryptic or high-level, responses to a list
>>>> where
>>>> people give their time for free, *and* not providing a more complete
>>>> solution is unfair, Rolf. The OP is free to respond and ask for
>>>> additional help once they've given it a go if they are still having
>>>> trouble.
>>>
>>>        When the ``correct response'' is seriously misleading, as
>>>        this one was --- the implication of the response was that
>>>        the specified task *could* be done (if one looked hard
>>>        enough at the help files), when in fact the specified task
>>>        can't be done (at least not without substantial hacking)
>>
>> Hardly "substantial hacking" Rolf, and somewhat educational in regards
>> of the underlying functions used by boxplot. My suggestion is 9 lines of
>> code, and it only stretches to 9 because I did each step in turn to make
>> it easier to understand/explain.
>>
>>>        --- then I think criticism is merited.
>>>
>>>        Also when a clear answer (``It can't be done.'') is as easy to
>>>        give as an obscurantist misleading one (``RTFM'') then criticism
>>>        is merited.
>>
>> Sorry Rolf, but "it can't be done" is somewhat subjective. All one is
>> doing is plotting a character on a graphics device at a certain
>> location, with the actual character determined on the basis of some a
>> priori determined indicator of "extreme" outlyingness. I showed how it
>> *could* be done, by manipulating the 'coef' argument of boxplot.stats(),
>> which works if you can couch your definition of "extreme" in terms of
>> the box height.
>>
>> ## install.packages("fortunes")
>> require("fortunes")
>> fortune("this is R")
>>
>> :-)
>>
>> This is not to say that I necessarily think one should do this, but the
>> author of boxplot.stats must have envisaged a situation where you might
>> want to alter the definition of "outlier" (not that that is the right
>> word in this case as these observations are potentially just extreme,
>> not necessarily outliers). After all, all we are doing is determining
>> how far from the box centre we would like to start showing individual
>> observations.
>>
>> I admit that RTFM wasn't that helpful for a newbie (I said as much), but
>> replying with "it can't be done" is just as useless if not more so. In
>> this case one can do it if one has some definition of "extreme" that
>> allows you to determine which points, if any, to draw. Showing how that
>> can be done but wrapping it in suggestive language that this might not
>> be a "Good Idea" (TM) is better than your suggested response.
>>
>> G
>>
>>>
>>>        There is a difference between saying RTFM to a poster who has
>>>        clearly been too lazy to do his or her homework and saying RTFM
>>>        to a poster when TFM is not at all clear with respect to the
>>>        question posed.  There are so many arguments to bxp() that anyone
>>>        might be forgiven for thinking ``There must be a way to do what
>>>        I want; I just haven't twigged to the correct way of putting
>>>        these arguments together.''  Deliberately steering a new user
>>>        into such a misapprehension is unforgivable.
>>>
>>>                cheers,
>>>
>>>                        Rolf Turner
>>>
>>> ######################################################################
>>> Attention:
>>> This e-mail message is privileged and confidential. If you are not the
>>> intended recipient please delete the message and notify the sender.
>>> Any views or opinions presented are solely those of the author.
>>>
>>> This e-mail has been scanned and cleared by MailMarshal
>>> www.marshalsoftware.com
>>> ######################################################################
>>
>> --
>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>>  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
>>  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
>>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
>> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>>
>
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



