[R] help understanding box plots
Agustin Lobo
alobo at ija.csic.es
Fri Feb 22 15:24:59 CET 2002
I've always thought that it would be most useful to have a
graphic example of a boxplot that includes some text
pointing to the main features of the boxplot
and defining and explaining these
features. Perhaps this
could be made into a simple function (e.g., boxplot.example())
and this function could be included in the help entry. Then
the user would just run boxplot.example() to
see a graphic and commented example. It's more
difficult to understand a text describing
the boxplot function than to just see a commented
graphic example.
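
A minimal sketch of what such a function might look like. No such
function exists in R; the body, the default data and the label
positions are assumptions chosen only for illustration. It relies on
boxplot.stats() to get the five plotted values and any outliers, then
writes a label next to each one.

```r
# Hypothetical helper (not part of R): draw one boxplot and label its parts.
boxplot.example <- function(x = c(2, 4, 5, 6, 7, 8, 9, 11, 20), coef = 1.5) {
  # stats = lower whisker end, lower hinge, median, upper hinge,
  # upper whisker end; out = points drawn individually as outliers
  s <- boxplot.stats(x, coef = coef)

  # Widen the x axis so there is room for the labels to the right of the box
  boxplot(x, range = coef, xlim = c(0.5, 2.5), main = "Anatomy of a boxplot")

  labels <- c("lower whisker end", "lower hinge", "median",
              "upper hinge", "upper whisker end")
  text(1.3, s$stats, labels, adj = 0)

  if (length(s$out) > 0)
    text(1.3, s$out, "outlier", adj = 0)

  invisible(s)
}

boxplot.example()
```

With the made-up default data the hinges are 5 and 9, so the box has
length 4 and the value 20 lies more than 1.5 * 4 above the upper
hinge; it is drawn as an outlier and the upper whisker stops at 11.
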
Agus
Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es
On 22 Feb 2002, Peter Dalgaard BSA wrote:
> Jay Pfaffman <pfaffman at relaxpc.com> writes:
>
> > Another naive stats question. I'm trying to better understand what
> > boxplots are telling me.
> >
> > I think what I see is the median and the boundaries of the 1st and 3rd
> > quartiles. The whiskers represent the range of the data unless there
> > are points which are outside "range" (default: 1.5) times the distance
> > from the median to that quartile. Is that right?
>
> Not quite. 1.5 times the length of the entire box.
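
The rule can be checked by hand. The following is only a sketch with
made-up data: it takes the hinges from fivenum(), computes the limits
at 1.5 box lengths beyond each hinge, and compares the resulting
whisker ends with what boxplot.stats() reports.

```r
x <- c(1, 3, 4, 5, 6, 7, 8, 9, 10, 25)   # made-up data

h   <- fivenum(x)        # min, lower hinge, median, upper hinge, max
box <- h[4] - h[2]       # length of the entire box

# Limits beyond which a point is plotted individually
lower.limit <- h[2] - 1.5 * box
upper.limit <- h[4] + 1.5 * box

# Whiskers end at the most extreme data points inside the limits
whiskers <- range(x[x >= lower.limit & x <= upper.limit])
outliers <- x[x < lower.limit | x > upper.limit]

whiskers
outliers

# Same whisker ends as computed by boxplot.stats()
boxplot.stats(x)$stats[c(1, 5)]
```
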
>
> > I've read the
> > documentation for boxplot numerous times, but don't quite understand
> > it well enough to communicate it to my professor who's helping me with
> > this project. (You'll be relieved to know that neither of us fancies
> > ourself a statistician!)
>
> boxplot.stats.Rd had a typo and got updated recently in the
> development and patch versions to read
>
> \item{coef}{this determines how far the plot ``whiskers'' extend out
> from the box. If \code{coef} is positive, the whiskers extend to the
> most extreme data point which is no more than \code{coef} times
> the length of the box away from the box. A value of zero causes
> the whiskers to extend to the data extremes (and no outliers be
> returned).}
>
> (for some reason this hasn't yet found its way to the online snapshot
> manuals in http://stat.ethz.ch/R-alpha/R-devel/doc/html/ and friends.
> Martin?)
>
>
> > V&R (p. 122) claims that the hinges are "roughly quartiles," so
> > perhaps my naive understanding is close enough.
>
> Yes. The exact definition is slightly peculiar, but in compliance with
> the original definition by Tukey. So I'm told, anyway.
>
>
> > I've got a relatively small data set (n~=12). I think it would help
> > to see the data points plotted on top of the boxplots. Here's what
> > I'm doing now:
> >
> > par(las=2,ps=14,mar=c(15, 4, 4, 2))
> > boxplot(split(ranks,c(1:25)), names=items, notch=T, horizontal=F, add=F)
> >
> > If I could get the points of each of the 25 variables plotted on top
> > of the box, that'd be great.
>
> Not sure what you're doing there, but maybe some code like this could
> help:
>
> x1<-rnorm(20)
> x2<-rnorm(20)
> boxplot(list(x1=x1,x2=x2))
> points(cbind(1,x1))
> points(cbind(2,x2))
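
For many groups at once, one possible alternative to repeated points()
calls is stripchart() with add = TRUE. This is a sketch on simulated
data; the five groups of 12 values are an assumption standing in for
the real ranks, which are not shown in this thread.

```r
set.seed(1)   # reproducible made-up data

# Five groups of 12 values each, as a list like the one split() returns
groups <- split(rnorm(60), rep(1:5, each = 12))

boxplot(groups)

# Overlay the raw points on the existing plot; jitter spreads them
# sideways a little so that coinciding values stay visible
stripchart(groups, vertical = TRUE, method = "jitter",
           add = TRUE, pch = 1)
```
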
>
>
> --
> O__ ---- Peter Dalgaard Blegdamsvej 3
> c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>