[R] Impaired boxplot functionality - mean instead of median
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Thu Dec 1 23:05:10 CET 2005
Evgeniy Kachalin wrote:
> Marc Schwartz (via MN) wrote:
>
>>>Marc Schwartz (via MN) wrote:
>
>
>>>So plotmeans is incapable of: boxplot(numerical~fact1+fact2). Is there
>>>any way further?
>>
>>
>>I think that somehow we are talking past each other here.
>>
>>plotmeans() does what it is designed to do, which is to simplify the
>>process of plotting group-wise point estimates and user defined error
>>bars/intervals around the point estimates.
>>
>>In your case, these intervals would be standard deviations around each
>>of the group means as you have indicated.
>>
>>Review the examples in ?plotmeans.
>>
>>As Martin and others have pointed out, you need to remove boxplots from
>>the equation here, as they were not designed to plot means and standard
>>deviations.
>>
>
>
> Again, what I'm talking about: plotmeans is incapable of analyzing the
> formula. For example, I have two factors: A - a, b, c, and B - d, e, f.
>
> If I plot: boxplot(num~A+B) what do I get? Nine boxes: ad, ae, af, bd,
> be, bf, cd, ce, cf. If I plot: plotmeans(num~A+B) - what do I get?
> Nothing. Because plotmeans cannot combine two factors in various
> combinations. Is there a simple way to do it?
>
> Anyway... That's the wrong way; all that is necessary is to have a boxplot
> with the mean instead of the median. Is there a simple way to do it?
>
> Statistical software like Statistica 7.0 offers any possible combination
> of what "Boxplot" could mean. Is it possible to have only one
> modification to R's boxplot?
>
> Thank you for kind answers.
> Also please tell me where I should send replies: to the list address
> or to those who answer me directly.
>
library(Hmisc)
library(lattice)
?panel.bpplot
bwplot(...., panel=panel.bpplot)
By default, panel.bpplot shows the mean (dot) and median (line) plus
several quantiles. To bother Martin in a friendly way, I think that
means can be useful additions - not that they are so useful by
themselves, but that when they differ a lot from the median,
non-statisticians gain further information about asymmetry. Also, even
though the simple box plot is elegant, I sometimes think it has a high
ink-to-information ratio. I have gained a lot from seeing outer
quantiles on the plot, and I don't like to show outer points for fear of
someone labeling them outliers. For describing raw data distributions,
I never find standard deviations useful, however.
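
As a rough illustration of the call above for the two-factor case from the
question, here is a minimal sketch. The data frame d, its columns num, A and
B, and the combined factor AB are made up for the example (they are not from
the original thread), and the response is drawn from an exponential
distribution only so that the mean and the median visibly differ.

```r
library(Hmisc)
library(lattice)

set.seed(1)

# Hypothetical data: two crossed factors with 40 observations per cell.
d <- expand.grid(A = c("a", "b", "c"), B = c("d", "e", "f"), rep = 1:40)
d$num <- rexp(nrow(d), rate = 1)  # right-skewed response

# One combined factor with a level for each A-by-B combination (9 here).
d$AB <- interaction(d$A, d$B)

# Grouping factor on the left, numeric response on the right, which gives
# horizontal box-percentile plots with the mean drawn as a dot and the
# median as a line in each box.
print(bwplot(AB ~ num, data = d, panel = panel.bpplot))

# Alternative layout: one panel per level of B, boxes for A within each.
print(bwplot(A ~ num | B, data = d, panel = panel.bpplot, layout = c(1, 3)))
```

The first call puts all nine combinations on one axis, much as
boxplot(num~A+B) would; the second uses lattice conditioning instead, which
may be easier to read when there are many levels.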
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University