# [R] mild and extreme outliers in boxplot

Rnewbie
Thu Aug 20 01:27:01 CEST 2009

```
I read the boxplot() help file and googled before making the post, and with
my little knowledge on R I was not able to plot in the way I wanted. That’s
why I made the post. Whether I can eventually solve the problem or not, I
appreciate very much any help.

I’m very much a beginner with R, and found the R help forum a couple of weeks ago.
Since I’m not among the major players of the forum, and I assumed that the
post itself rather than the poster is what matters, as in any other public online
forum, I just registered with an arbitrarily chosen ID and kept using it. I
hope I haven't violated any rules by doing this. I’m not using R
help for any commercial purposes whatsoever. I’m a master’s student working
on my thesis.

Jimmy

Gavin Simpson wrote:
>
> On Wed, 2009-08-19 at 13:49 -0700, Bert Gunter wrote:
>> Rolf:
>>
>> Not sure what "reasonably thorough" means but:
>>
>>  ? boxplot says:
>
> Exactly Bert, the info is there if you want to look and do so hard
> enough. However, it is perhaps expecting quite a lot of a new useR to
> put this together from ?boxplot or ?bxp, and ?boxplot.stats.
>
> Criticising correct, if cryptic or high-level, responses to a list where
> people give their time for free, *and* not providing a more complete
> solution is unfair, Rolf. The OP is free to respond and ask for
> additional help once they've given it a go if they are still having
> trouble.
>
> One solution, if you are prepared to bastardise the standard
> interpretation of the boxplot, is to compute the relevant boxplot
> statistics using boxplot.stats and alter argument 'coef' to some larger
> multiple of the box height to represent "extreme" outliers, whatever
> those might be. So here's the rope, try not to hang yourself 'Rnewbie'!
>
> set.seed(1234)
> dat <- rt(100, df = 2)
> bxp1 <- boxplot.stats(dat)
> bxp2 <- boxplot.stats(dat, coef = 2)
>
> ##Then you'd need to plot the boxplot without outliers
>
> boxplot(dat, outpch = NA)
>
> ##Then plot the points 1.5-2 x box height
>
> want <- bxp1$out %in% bxp2$out
> out <- bxp1$out
> out[want] <- NA
>
> points(rep(1, length(out)), out, pch = 1, col = "blue")
>
> ##Then the further outliers
>
> outout <- bxp2$out
> points(rep(1, length(outout)), outout, pch = 2, col = "red")
>
> How one decides what is an outlier or an extreme outlier is another
> matter...? By chance the dummy data here shows one problem; there isn't
> much difference between 'outliers' and 'extreme outliers' towards the
> bottom of the resulting plot so why should we distinguish them?
>
> (By the way 'Rnewbie', this isn't something I recommend you do, but you
> might know more about your real world use case than I.)
>
> HTH
>
> G
>
> Ps; is there a reason why you post anonymously, 'Rnewbie'? Do you not
> want us to know who you are, but want our help?
>
>>
>> ...
>> pars    a list of (potentially many) more graphical parameters, e.g.,
>> boxwex
>> or outpch; these are passed to bxp (if plot is true); for details, see
>> there.
>>
>>
>> Well, that seems pretty clear to me, so I went to ?bxp to find in the
>> pars
>> listing:
>>
>> outlty, outlwd, outpch, outcex, outcol, outbg:
>> outlier line type, line width, point character, point size expansion,
>> color,
>> and background color. The default outlty= "blank" suppresses the lines
>> and
>> outpch=NA suppresses points.
>>
>>
>> It seems to me that this (and other omitted excerpts + examples) is at
>> least
>> a reasonable answer to the query (allowing the reader to at least infer
>> that
>> bxp does not distinguish degrees of outlyingness), so I don't understand
>> your criticism. Feel free to respond privately if you prefer.
>>
>> -- Bert
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On
>> Behalf Of Rolf Turner
>> Sent: Wednesday, August 19, 2009 1:27 PM
>> To: ottorino-luca.pantani at unifi.it
>> Cc: Rnewbie; ERRE
>> Subject: Re: [R] mild and extreme outliers in boxplot
>>
>>
>> On 20/08/2009, at 3:13 AM, Ottorino-Luca Pantani wrote:
>>
>> > Rnewbie ha scritto:
>> >> dear all,
>> >>
>> >> could somebody tell me how I can plot mild outliers as a circle(°)
>> >> and
>> >> extreme outliers as an asterisk(*) in a box-whisker plot?
>> >>
>> >> Thanks very much in advance
>> >>
>> > ?boxplot
>> >
>> > or
>> >
>> > help(bxp)
>>
>> This is the sort of response that gives R-help a bad name.
>>
>> I had a reasonably thorough look at these help files and saw
>> ***nothing***
>> that would answer the OP's question.  The information may be there
>> --- I'm
>> to the appropriate lines of the help file(s) would be useful.
>>
>> 	cheers,
>>
>> 		Rolf Turner
>>

```
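
The approach Gavin describes in the thread (compute outliers twice with `boxplot.stats()` at two `coef` values, suppress the default outlier points, then overplot each group) might be wrapped up roughly as below, using the circle and asterisk symbols from the original question. This is only a sketch: `classify_outliers()` is a hypothetical helper name, and the cutoff of 3 for "extreme" outliers is an assumption based on a common convention, not something the thread specifies (Gavin's example used `coef = 2`).

```r
# Sketch: split boxplot outliers into "mild" and "extreme" groups.
# classify_outliers() is a hypothetical helper, not part of base R.
# Assumption: extreme_coef = 3 follows a common convention; the thread
# itself does not fix a cutoff.
classify_outliers <- function(x, mild_coef = 1.5, extreme_coef = 3) {
  mild_or_worse <- boxplot.stats(x, coef = mild_coef)$out
  extreme <- boxplot.stats(x, coef = extreme_coef)$out
  list(
    mild = mild_or_worse[!(mild_or_worse %in% extreme)],  # beyond mild_coef only
    extreme = extreme                                      # beyond extreme_coef
  )
}

set.seed(1234)
dat <- rt(100, df = 2)
outl <- classify_outliers(dat)

# Draw the box and whiskers but suppress the default outlier points
boxplot(dat, outpch = NA)

# Mild outliers as circles (pch = 1), extreme outliers as asterisks (pch = 8)
points(rep(1, length(outl$mild)), outl$mild, pch = 1)
points(rep(1, length(outl$extreme)), outl$extreme, pch = 8)
```

Using `outpch = NA` rather than `outline = FALSE` keeps the y-axis range wide enough for the overplotted points to remain visible. As Gavin notes, whether the mild/extreme distinction is meaningful for a given data set is a separate question.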