[R] Whiskers on the default boxplot {graphics}
David Winsemius
dwinsemius at comcast.net
Wed May 12 06:03:09 CEST 2010
On May 11, 2010, at 11:55 PM, Jason Rupert wrote:
> Okay...Let me see if I've got it...
>
> I'm just trying to use the default boxplot {graphics} capability in
> R...
>
> So I call something like the following:
>> boxplot(mpg~cyl, data=mtcars, main="Car Milage Data",
>>         xlab="Number of Cylinders", ylab="Miles Per Gallon")
>
> That produces something as shown in the following:
> http://www.statmethods.net/graphs/images/boxplot1.jpg
>
> When that default boxplot is called, i.e. boxplot {graphics}, as
> shown in the line of code above, it is actually calling into
> boxplot.stats {grDevices}. When boxplot.stats {grDevices} is called
> it has a default value for "coef" of 1.5, i.e. coef = 1.5.
>
> If I understand the purpose of "coef" correctly, it means that the
> ‘whiskers’ should extend out 1.5 times the length of the box away
> from the box. Is that correct?
No. Read it again.
--
David.
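
To make the wording in ?boxplot.stats concrete, here is a minimal sketch on a made-up vector (x and the variable names are illustrative assumptions, not anything from this thread). The whiskers end at the most extreme observations that lie no more than coef times the box length beyond the hinges; they are not drawn at a fixed 1.5 box lengths.

```r
# Made-up data with one extreme value (an assumption for illustration)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

coef   <- 1.5
hinges <- stats::fivenum(x)[c(2, 4)]   # lower and upper hinge (box edges)
box    <- diff(hinges)                 # length of the box

# Limits beyond which a point is treated as an outlier
lower_limit <- hinges[1] - coef * box
upper_limit <- hinges[2] + coef * box

# Whiskers end at the most extreme data points inside those limits
whiskers <- range(x[x >= lower_limit & x <= upper_limit])

bs <- grDevices::boxplot.stats(x, coef = coef)
whiskers            # hand-computed whisker ends
bs$stats[c(1, 5)]   # whisker ends reported by boxplot.stats
bs$out              # points plotted individually beyond the whiskers
```

In this example the upper limit is 8 + 1.5 * 5 = 15.5, yet the upper whisker stops at 9 because that is the largest observation within the limit.
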
>
> Now I look back at the plot, and I'm not sure how 1.5 times the
> length of the box corresponds with the whisker lengths shown in the
> image:
> http://www.statmethods.net/graphs/images/boxplot1.jpg
>
> Is it that the whisker length is a total of 1.5 the length of the
> box and centered about the median (2nd Quartile)?
>
> Just trying to get a handle on this, so thanks again for all the
> help in deciphering this.
>
>
> ________________________________
> From: RJ Cunningham <robut at iinet.net.au>
> Cc: R Project Help <R-help at r-project.org>
> Sent: Tue, May 11, 2010 9:57:48 PM
> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>
> I think not. Isn't the "secret" here?
>
>
> Arguments:
>
>        x: a numeric vector for which the boxplot will be constructed
>           ('NA's and 'NaN's are allowed and omitted).
>
>     coef: this determines how far the plot 'whiskers' extend out
>           from the box. If 'coef' is positive, the whiskers extend
>           to the most extreme data point which is no more than
>           'coef' times the length of the box away from the box. A
>           value of zero causes the whiskers to extend to the data
>           extremes (and no outliers be returned).
>
> do.conf,do.out: logicals; if 'FALSE', the 'conf' or 'out'
>           component respectively will be empty in the result.
>
> Details:
>
> The two 'hinges' are versions of the first and third quartile,...
>
>
> On Wed May 12 10:35, Jason Rupert sent:
>
>> Hmm... Maybe I need to look somewhere other than boxplot.stats
>> {grDevices} for a definition of how the upper/lower whiskers are
>> produced.
>>
>> By any chance are they "the lowest datum still within 1.5 IQR of
>> the lower quartile, and the highest datum still within 1.5 IQR of
>> the upper quartile"?
>>
>> None of the links from boxplot.stats {grDevices} seemed to reveal
>> the secret definition of the R whiskers.
>>
>> Thanks again.
>>
>> ----- Original Message ----
>> To: David Winsemius <dwinsemius at comcast.net>
>> Cc: R Project Help <R-help at r-project.org>
>> Sent: Tue, May 11, 2010 9:26:25 PM
>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>
>> Wowzers...
>>
>> From ?boxplot.stats:
>>
>> Details
>>
>> The two ‘hinges’ are versions of the first and third quartile,
>> i.e., close to quantile(x, c(1,3)/4). The hinges equal the
>> quartiles for odd n (where n <- length(x)) and differ for even n.
>> Whereas the quartiles only equal observations for n %% 4 == 1 (n =
>> 1 mod 4), the hinges do so additionally for n %% 4 == 2 (n = 2 mod
>> 4), and are in the middle of two observations otherwise.
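
The hinge/quartile distinction in the paragraph above can be checked directly. A small sketch, assuming two made-up vectors of odd and even length (odd_x, even_x and the other names are illustrative only) and the default quantile() type:

```r
# Illustrative vectors (assumptions, not data from the thread)
odd_x  <- 1:9
even_x <- 1:10

# Hinges are elements 2 and 4 of Tukey's five-number summary
hinges_odd     <- stats::fivenum(odd_x)[c(2, 4)]
quartiles_odd  <- unname(stats::quantile(odd_x, c(1, 3) / 4))
hinges_even    <- stats::fivenum(even_x)[c(2, 4)]
quartiles_even <- unname(stats::quantile(even_x, c(1, 3) / 4))

hinges_odd;  quartiles_odd    # the same for odd n
hinges_even; quartiles_even   # differ for even n
```

The box in boxplot() is drawn at the hinges, so for even n its edges can differ slightly from what quantile() reports.
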
>>
>> The notches (if requested) extend to +/-1.58 IQR/sqrt(n). This
>> seems to be based on the same calculations as the formula with 1.57
>> in Chambers et al. (1983, p. 62), given in McGill et al. (1978, p.
>> 16). They are based on asymptotic normality of the median and
>> roughly equal sample sizes for the two medians being compared, and
>> are said to be rather insensitive to the underlying distributions
>> of the samples. The idea appears to be to give roughly a 95%
>> confidence interval for the difference in two medians.
>>
>> Is a notch equal to the upper/lower whisker? Is this just a
>> difference of terminology or something?
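
On the notch question: a notch is a separate quantity from a whisker. A rough sketch, assuming the 4-cylinder subset of mtcars as example data (the variable names are illustrative): the notch is an interval around the median built from the formula quoted above, whereas the whisker ends are observed data values.

```r
# Example data: mpg for 4-cylinder cars (chosen here only for illustration)
x  <- datasets::mtcars$mpg[datasets::mtcars$cyl == 4]
bs <- grDevices::boxplot.stats(x)

iqr   <- diff(bs$stats[c(2, 4)])                       # distance between the hinges
notch <- bs$stats[3] + c(-1.58, 1.58) * iqr / sqrt(bs$n)

notch               # hand-computed notch limits around the median
bs$conf             # the 'conf' component returned by boxplot.stats
bs$stats[c(1, 5)]   # whisker ends, which are actual observations
```

Notches are only drawn when requested, for example with boxplot(mpg ~ cyl, data = mtcars, notch = TRUE).
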
>>
>> Thanks again for all the insights.
>>
>> ----- Original Message ----
>> From: David Winsemius <dwinsemius at comcast.net>
>> Cc: R Project Help <R-help at r-project.org>
>> Sent: Tue, May 11, 2010 9:00:15 PM
>> Subject: Re: [R] Whiskers on the default boxplot {graphics}
>>
>> On May 11, 2010, at 9:45 PM, Jason Rupert wrote:
>>
>>> How are the lower/upper whiskers defined in the default version of
>>> boxplot {graphics}?
>>>
>>> I tried help(boxplot) and searching www.rseek.org, but I was
>>> unable to determine an absolute answer.
>>
>> You need to follow the links from the help pages, and in this case
>> it appears that you did not follow the one to
>>
>> ?boxplot.stats
>>
>>> I checked out the definition of boxplot according to Wikipedia
>>> (http://en.wikipedia.org/wiki/Box_plot), but it also had several
>>> approaches listed for how the whiskers could be determined, so I'm
>>> just curious how the default boxplot {graphics} does it.
>>>
>>> Thanks for any feedback
>>
>> Follow links with the R help system.
>>
>>> and insights.
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT