[R] Whiskers on the default boxplot {graphics}
Shi, Tao
shidaxia at yahoo.com
Wed May 12 21:27:15 CEST 2010
Jason,
All these are clearly defined in the help file for 'boxplot' under 'range'. Don't understand how you missed that.
...Tao
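
For anyone following along, here is a minimal sketch of the point about 'range'. As far as I can tell, boxplot() hands its 'range' argument on to boxplot.stats() as 'coef', so the two computations below should agree. The mtcars call is the one from the original question; the names b and per_group are only illustrative.

```r
# 'range' in boxplot() plays the role of 'coef' in boxplot.stats().
# plot = FALSE returns the statistics without drawing anything.
b <- boxplot(mpg ~ cyl, data = mtcars, range = 1.5, plot = FALSE)

# The same statistics computed group by group with boxplot.stats().
per_group <- sapply(split(mtcars$mpg, mtcars$cyl),
                    function(v) boxplot.stats(v, coef = 1.5)$stats)

# Rows: lower whisker, lower hinge, median, upper hinge, upper whisker.
print(b$stats)
print(all.equal(b$stats, unname(per_group)))
```

Reading b$stats directly is usually easier than measuring whisker lengths off the plot.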
----- Original Message ----
> From: Jason Rupert <jasonkrupert at yahoo.com>
> To: Dennis Murphy <djmuser at gmail.com>
> Cc: R Project Help <R-help at r-project.org>
> Sent: Wed, May 12, 2010 3:40:12 AM
> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>
> Fantastic!
>
> It would be great if the description could be modified to include the
> mysterious bit about the upper and lower bound whisker positions:
>
>   upper whisker = min(max(x), Q_3 + 1.5 * IQR)
>   lower whisker = max(min(x), Q_1 - 1.5 * IQR)
>
> Maybe that is clearly written in the description of boxplot.stats
> {grDevices}, but evidently I missed it numerous times and also did not
> pick up on this intent from the original description of boxplot
> {graphics}.
>
> Your type of descriptive answer and helpfulness is much appreciated and
> one of the reasons I continue to endorse the R tool over numerous others.
>
> More like you and the tool may be headed for domination in the market.
>
> Thanks again!
>
> ________________________________
> From: Dennis Murphy <djmuser at gmail.com>
> Cc: R Project Help <R-help at r-project.org>
> Sent: Wed, May 12, 2010 2:50:19 AM
> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>
> Hi:
>
> Let's do some math :)
>
>> Okay...Let me see if I've got it...
>>
>> I'm just trying to use the default boxplot {graphics} capability in R...
>>
>> So I call something like the following:
>>
>> boxplot(mpg~cyl,data=mtcars, main="Car Milage Data",
>>         xlab="Number of Cylinders", ylab="Miles Per Gallon")
>>
>> That produces something as shown in the following:
>> http://www.statmethods.net/graphs/images/boxplot1.jpg
>>
>> When that default boxplot is called, i.e. boxplot {graphics}, as shown in
>> the line of code above, it is actually calling into boxplot.stats
>> {grDevices}. When boxplot.stats {grDevices} is called it has a default
>> value for "coef" of 1.5, i.e. coef = 1.5.
>>
>> If I understand the purpose of "coef" correctly, it means that the
>> ‘whiskers’ should extend out 1.5 times the length of the box away from
>> the box. Is that correct?
>
> If by 'length of the box' you mean the interquartile range (IQR = Q_3 - Q_1
> where Q refers to quartile), then assuming that x is the numeric vector of
> interest for a boxplot,
>
>   upper whisker = min(max(x), Q_3 + 1.5 * IQR)
>   lower whisker = max(min(x), Q_1 - 1.5 * IQR)
>
> So the upper whisker is located at the *smaller* of the maximum x value and
> Q_3 + 1.5 IQR, whereas the lower whisker is located at the *larger* of the
> smallest x value and Q_1 - 1.5 IQR. In your terms, the whiskers should
> extend out a *maximum* of "1.5 times the length of the box away from the
> box".
>
> Visually, this means that individual points more extreme in value than
> Q_3 + 1.5 IQR are plotted separately at the high end, and those below
> Q_1 - 1.5 IQR are plotted separately on the low end. Depending on the
> source, the separately plotted points are called 'outside values'. On the
> other hand, if the maximum or minimum values of x are closer than 1.5 IQR
> in distance from its nearest quartile, then that is where the whisker is
> positioned.
>
> Does that make sense?
>
> HTH,
> Dennis
>
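
A small check of the rule above, offered as a sketch and not as a definitive account of what boxplot() does internally. It uses a made-up vector x with one extreme value and builds the 1.5 IQR fences from the hinges returned by fivenum(). One caveat: going by the 'coef' description quoted further down in this thread, the whisker stops at the most extreme data point inside the fence, so when a point falls outside, the whisker end is a data value and generally not Q_3 + 1.5 IQR itself.

```r
# Made-up data: nine ordinary values and one far-out value.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

h   <- fivenum(x)          # min, lower hinge, median, upper hinge, max
iqr <- h[4] - h[2]         # "length of the box"

upper_fence <- h[4] + 1.5 * iqr
lower_fence <- h[2] - 1.5 * iqr

# Whiskers end at the most extreme data points that lie within the fences.
upper_whisker <- max(x[x <= upper_fence])
lower_whisker <- min(x[x >= lower_fence])

# Compare with what boxplot.stats() reports.
bs <- boxplot.stats(x)
print(c(lower = lower_whisker, upper = upper_whisker))
print(bs$stats[c(1, 5)])   # whisker ends
print(bs$out)              # points drawn separately
print(upper_fence)         # the fence, which is not a data value here
```

When no point lies beyond a fence, the whisker on that side is simply min(x) or max(x), which matches the formulas above.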
>> Now I look back at the plot, and I'm not sure how 1.5 times the length of
>> the box corresponds with the whisker lengths shown in the image:
>> http://www.statmethods.net/graphs/images/boxplot1.jpg
>>
>> Is it that the whisker length is a total of 1.5 the length of the box and
>> centered about the median (2nd Quartile)?
>>
>> Just trying to get a handle on this, so thanks again for all the help in
>> deciphering this.
>>
>> ________________________________
>> From: RJ Cunningham <robut at iinet.net.au>
>> Cc: R Project Help <R-help at r-project.org>
>> Sent: Tue, May 11, 2010 9:57:48 PM
>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>
>> I think not. Isn't the "secret" here?
>>
>> Arguments:
>>
>>        x: a numeric vector for which the boxplot will be constructed
>>           ('NA's and 'NaN's are allowed and omitted).
>>
>>     coef: this determines how far the plot 'whiskers' extend out from
>>           the box. If 'coef' is positive, the whiskers extend to the
>>           most extreme data point which is no more than 'coef' times
>>           the length of the box away from the box. A value of zero
>>           causes the whiskers to extend to the data extremes (and no
>>           outliers be returned).
>>
>> do.conf,do.out: logicals; if 'FALSE', the 'conf' or 'out' component
>>           respectively will be empty in the result.
>>
>> Details:
>>
>> The two 'hinges' are versions of the first and third quartile,...
>>
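
To see what the 'coef' text means in practice, here is a hedged sketch on a made-up vector (x and the result names are illustrative, not from the thread). It compares the default coef = 1.5 with coef = 0 and with a value large enough that no point is flagged.

```r
# Made-up data with one far-out value.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

default  <- boxplot.stats(x)            # coef = 1.5
extremes <- boxplot.stats(x, coef = 0)  # whiskers go to the data extremes
wide     <- boxplot.stats(x, coef = 5)  # fences far enough out to keep 30

# Whisker ends are elements 1 and 5 of $stats; $out holds flagged points.
print(default$stats[c(1, 5)]);  print(default$out)
print(extremes$stats[c(1, 5)]); print(extremes$out)
print(wide$stats[c(1, 5)]);     print(wide$out)
```

With the default, the value 30 is reported in $out and the upper whisker stops at the largest remaining data point. With coef = 0, nothing is reported as an outlier.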
>> On Wed May 12 10:35 , Jason Rupert sent:
>>
>>> Humm....Maybe I need to look some place else than boxplot.stats
>>> {grDevices} for a definition of how the upper/lower whiskers are
>>> produced.
>>>
>>> By any chance are they "the lowest datum still within 1.5 IQR of the
>>> lower quartile, and the highest datum still within 1.5 IQR of the upper
>>> quartile"?
>>>
>>> None of the links from boxplot.stats {grDevices} seemed to reveal the
>>> secret definition of the R whiskers.
>>>
>>> Thanks again.
>>>
>>> ----- Original Message ----
>>> To: David Winsemius <dwinsemius at comcast.net>
>>> Cc: R Project Help <R-help at r-project.org>
>>> Sent: Tue, May 11, 2010 9:26:25 PM
>>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>>
>>> Wowzers...
>>>
>>> From ?boxplot.stats:
>>>
>>> Details
>>>
>>> The two ‘hinges’ are versions of the first and third quartile, i.e.,
>>> close to quantile(x, c(1,3)/4). The hinges equal the quartiles for odd n
>>> (where n <- length(x)) and differ for even n. Whereas the quartiles only
>>> equal observations for n %% 4 == 1 (n = 1 mod 4), the hinges do so
>>> additionally for n %% 4 == 2 (n = 2 mod 4), and are in the middle of two
>>> observations otherwise.
>>>
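
The hinge versus quartile distinction in that Details text can be checked with a short sketch. The helpers hinges() and quartiles() are hypothetical names made up for this illustration; they just wrap fivenum() and quantile() with its default settings.

```r
# Hypothetical helpers, defined only for this illustration.
hinges    <- function(x) fivenum(x)[c(2, 4)]                 # lower, upper hinge
quartiles <- function(x) unname(quantile(x, c(1, 3) / 4))    # default quantile type

x_odd  <- 1:9    # odd n: hinges and quartiles should coincide
x_even <- 1:10   # even n: they should differ

print(hinges(x_odd));  print(quartiles(x_odd))
print(hinges(x_even)); print(quartiles(x_even))
```

Since boxplot.stats() builds the box from the hinges, the box edges can differ slightly from quantile() output when n is even.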
>>> The notches (if requested) extend to +/-1.58 IQR/sqrt(n). This seems to
>>> be based on the same calculations as the formula with 1.57 in Chambers
>>> et al. (1983, p. 62), given in McGill et al. (1978, p. 16). They are
>>> based on asymptotic normality of the median and roughly equal sample
>>> sizes for the two medians being compared, and are said to be rather
>>> insensitive to the underlying distributions of the samples. The idea
>>> appears to be to give roughly a 95% confidence interval for the
>>> difference in two medians.
>>>
>>> Is a notch equal to the upper/lower whisker? Is this just a difference of
>>> terminology or something?
>>>
>>> Thanks again for all the insights.
>>>
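
On the notch question, a brief sketch may help; it assumes the mtcars data from elsewhere in the thread and uses illustrative variable names. The notch is an interval around the median built from the formula quoted above, and boxplot.stats() returns it in $conf, separately from the whisker ends in $stats.

```r
x  <- mtcars$mpg
bs <- boxplot.stats(x)

# Notch limits by hand: median +/- 1.58 * IQR / sqrt(n), with the IQR
# taken between the hinges.
h     <- fivenum(x)
notch <- h[3] + c(-1.58, 1.58) * (h[4] - h[2]) / sqrt(length(x))

whiskers <- bs$stats[c(1, 5)]

print(notch)      # hand-computed notch limits
print(bs$conf)    # notch limits as reported by boxplot.stats()
print(whiskers)   # whisker ends, a different quantity
```

So the two are not the same thing: the notch is a narrow band around the median, while the whiskers describe the spread of the data.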
>>> ----- Original Message ----
>>> From: David Winsemius <dwinsemius at comcast.net>
>>> Cc: R Project Help <R-help at r-project.org>
>>> Sent: Tue, May 11, 2010 9:00:15 PM
>>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>>
>>> On May 11, 2010, at 9:45 PM, Jason Rupert wrote:
>>>
>>>> How are the lower/upper whiskers defined in the default version of
>>>> boxplot {graphics}?
>>>>
>>>> I tried help(boxplot) and searching www.rseek.org, but I was unable to
>>>> determine an absolute answer.
>>>
>>> You need to follow the links from the help pages and in this case it
>>> appears that you did not follow the one to
>>>
>>> ?boxplot.stats
>>>
>>>> I checked out the definition of boxplot according to Wikipedia
>>>> (http://en.wikipedia.org/wiki/Box_plot%5C), but it also had several
>>>> approaches listed for how the whiskers could be determined, so I'm just
>>>> curious how the default boxplot {graphics} does it.
>>>>
>>>> Thanks for any feedback
>>>
>>> Follow links with the R help system.
>>>
>>>> and insights.
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.