[R] Feature request: add boxplot()s to current plot (given x[i])
David James
dj at research.bell-labs.com
Fri Dec 10 17:58:52 CET 1999
Hi,
I experimented with a set of S functions to do "generalized" boxplots
some time ago (e.g., "vase" or "violin", "diamond" plots, etc.). There's
code to draw these gboxplots at arbitrary positions. Please take a look
at the help below and let me know if you'd like either to port
it to R or scavenge some of the code.
David A James                      Phone: (908) 582-3082
Bell Labs, Lucent Technologies     Fax: (908) 582-3340
600 Mountain Ave                   Email: dj at bell-labs.com
Murray Hill, NJ 07974
--------------------------------------------------------------------
Generalized Box Plots
USAGE:
gboxplot(..., type = "box", range.=,
width=, varwidth=F, notch=F, names.=, horiz = T,
fill=F, col=1, old = T, plot.it=TRUE)
gboxplot(..., type = "vase", from=, to=, kernel.width=, n=,
width=, varwidth=F, notch=F, names.=, horiz = T,
fill=F, col=1, plot.it=TRUE)
gboxplot(..., type = "diamond", width=, varwidth=F,
names.=, horiz = T, fill=F, col=1, pch, plot.it=TRUE)
gboxplot(..., type = "pts", jitter.pts = F,
width=, varwidth=F, notch=F, names.=, horiz = T,
fill=F, col=1, pch, plot.it=TRUE)
ARGUMENTS:
...: vectors or a list containing a number of numeric components
(e.g., the output of `split'). Missing values (NAs) are allowed.
type=: character string (the first letter suffices) specifying the
type of gboxplot: currently "box" for Tukey's boxplots,
"vase" for vase or violin plots (see Benjamini (1988) and
Hintze and Nelson (1998)), "diamond" for diamond plots
(see, for instance, JMP (1995)), or "pts" for one-dimensional
histograms (see Chambers et al. (1983)). Type may
also be the name of a user-written function that computes
an object for which there exists a `draw' method. For instance,
type = `cmp.vase' specifies the function that computes
vases and returns an object of class "vase";
the method `draw.vase' (automatically called by the generic
`draw') plots the vases.
range.=: controls the strategy for the whiskers and the detached
points beyond the whiskers. By default, whiskers
are drawn to the nearest value not beyond a standard range
from the quartiles; points beyond are drawn individually.
Giving `range.=0' forces whiskers to the full data range.
Any positive value of `range.' multiplies the standard
range by this amount. The standard range is
1.5*(interquartile range).
width=: vector of relative box widths. See also argument
`varwidth'.
varwidth=: if `TRUE', box widths will be proportional to the
square root of the number of observations for the box.
notch=: if `TRUE', notched boxes are drawn, where non-overlapping
notches of two boxes indicate a difference at a rough
5% significance level.
names.=: optional character vector of names for the groups.
If omitted, names used in labeling the plot will be taken
from the names of the arguments and from the names
attribute of lists.
plot.it=: if `TRUE', the box plot will be produced; otherwise,
the calculated summaries of the arguments are returned.
old=: if `TRUE', the plot will be produced in the style
described in the Tukey (1977) reference; otherwise, the plot
will follow the more modern style introduced in Tukey
(1990), where the advantages of the new style are
described.
horiz=: if `TRUE', boxes are drawn horizontally.
from: (vaseplots) lower bound for the percent of data to use
in fitting `density'. By default 0.25.
to: (vaseplots) upper bound for the percent of data to use in
fitting `density'. By default 0.75.
kernel.width: (vaseplots) width of the kernel window, as
defined in the function `density'. Its default corresponds
to the width of a histogram bar as computed by Doane's
rule.
n: (vaseplots) number of equally spaced density estimates.
Default is 25.
jitter.pts: (pts) if logical, it specifies whether or not to
jitter the points inside each group. If numeric, it specifies
the amount, in data units, to jitter. Default is
`FALSE'.
fill: should boxes or vases be filled? Default is FALSE.
col: vector of colors for each group.
pch: vector of plotting character for each group.
Graphical parameters may also be supplied as arguments to
this function (see `par').
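
The whisker rule for `range.' and the `varwidth' scaling can be sketched in plain R. This is only an illustration of the description above, not the gboxplot code: the helper names (whisker_limits, rel_widths) are made up, the default multiplier of 1 is an assumption, and quartiles are taken from quantile(), which may differ slightly from what gboxplot uses.

```r
# Hypothetical helper, not part of gboxplot: whisker ends and detached
# points for one group, following the `range.' rule described above.
whisker_limits <- function(x, range. = 1) {     # default of 1 is an assumption
  x <- x[!is.na(x)]                             # NAs are allowed and dropped
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  if (range. == 0) {
    lim <- range(x)                             # whiskers span the full data range
  } else {
    step <- range. * 1.5 * (q[2] - q[1])        # multiple of the standard range
    inside <- x[x >= q[1] - step & x <= q[2] + step]
    lim <- range(inside)                        # nearest values not beyond the range
  }
  list(whiskers = lim, out = x[x < lim[1] | x > lim[2]])
}

# Hypothetical helper: relative box widths proportional to sqrt(n),
# scaled so that the largest group has width 1.
rel_widths <- function(groups) {
  n <- sapply(groups, function(g) sum(!is.na(g)))
  sqrt(n) / max(sqrt(n))
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)
w <- whisker_limits(x)                # whiskers at 1 and 9, detached point 30
w0 <- whisker_limits(x, range. = 0)   # whiskers at 1 and 30, nothing detached
```

With `range.=2' the same data would need a point beyond 7.75 + 13.5 to be drawn individually, so 30 would still be detached.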
VALUE:
If `plot.it' is `FALSE', the value is a list as long as
there are data vectors with the components listed below.
Otherwise the generic function `draw' is invoked with these
components, plus optional `width', `varwidth' and `notch', to
produce the plot. Note that `draw' returns a list of
box/vase centers.
stats: vector giving the upper extreme, upper quartile, median,
lower quartile, and lower extreme for each box.
n: the number of observations in each group.
conf: vector giving confidence limits for the median.
out: vector of outlying points.
names.: names for each box (see argument `names.' above).
dnsty: a list with `x' and `y' components as output by `density'.
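
A rough idea of how the returned summaries and the `draw' dispatch fit together, written against base R. Only boxplot.stats() and UseMethod() are real R here; cmp_box and draw.box are hypothetical stand-ins, and boxplot.stats() uses hinges rather than quartiles and orders `stats' from low to high, so the vector is reversed to match the order listed above.

```r
# Hypothetical compute function: one summary per group, tagged with a class.
cmp_box <- function(groups, range. = 1) {
  out <- lapply(groups, function(g) {
    s <- boxplot.stats(g, coef = 1.5 * range.)  # coef = 0 gives full-range whiskers
    s$stats <- rev(s$stats)                     # upper extreme first, as listed above
    s                                           # components: stats, n, conf, out
  })
  structure(out, class = "box")
}

# Generic plus a method stub; a real method would do the drawing.
draw <- function(obj, ...) UseMethod("draw")

draw.box <- function(obj, ...) {
  centers <- seq_along(obj)                     # one position per group
  # ... drawing code would go here ...
  invisible(centers)
}

b <- cmp_box(list(a = c(1, 2, 3, 4, 100), b = 1:10))
centers <- draw(b)                              # dispatches to draw.box
```

A user-written type would follow the same pattern: a function returning an object of some class, and a `draw' method for that class.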
NOTES:
In the case of vase plots, the density is estimated using
kernel smoothing; this was done for expediency, but
other density estimates may be easily added (e.g., local
polynomial fitting as in Loader (1996)). Also note that
the aspect ratio of the density traces may be such that it
distorts important features of the data.
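
For illustration, the kernel estimate behind a vase might look like the following in R. This is a sketch under assumptions, not the original code: doane_width and vase_density are hypothetical names, `from' and `to' are read as quantile fractions, and the numeric `width' argument of R's density() is assumed to play the role of `kernel.width'.

```r
# Hypothetical helper: width of a histogram bar under Doane's rule.
doane_width <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  g1 <- mean(d^3) / mean(d^2)^1.5                   # sample skewness
  sg1 <- sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))    # its standard error
  k <- 1 + log2(n) + log2(1 + abs(g1) / sg1)        # Doane's number of bins
  diff(range(x)) / k
}

# Hypothetical helper: density trace between two quantiles of the data.
vase_density <- function(x, from = 0.25, to = 0.75, n = 25,
                         kernel.width = doane_width(x)) {
  x <- x[!is.na(x)]
  q <- quantile(x, c(from, to), names = FALSE)      # assumed meaning of from/to
  density(x, width = kernel.width, n = n, from = q[1], to = q[2])
}

set.seed(1)
x <- rnorm(200)
d <- vase_density(x)      # d$x and d$y each hold 25 equally spaced estimates
```

Swapping density() for another estimator only requires returning the same `x' and `y' components.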
REFERENCE:
Tukey, John W., Exploratory Data Analysis, Addison-Wesley,
Reading, Mass., 1977.
Tukey, John W., "Data Based Graphics: Visual Display in
the Decades to Come", Statistical Science, pp. 327-339,
1990.
Benjamini, Yoav, "Opening the Box of a Boxplot", The American
Statistician, pp. 257-262, 1988.
Chambers, J. M., Cleveland, W. S., Kleiner, B., and Tukey, P. A.,
Graphical Methods for Data Analysis, Wadsworth, Pacific
Grove, CA, 1983.
Hintze, Jerry L., and Nelson, Ray D., "Violin Plots: A Box
Plot-Density Trace Synergism", The American Statistician,
pp. 181-184, 1998.
Loader, Clive, "Local Likelihood Density Estimation", Annals
of Statistics, 1996.
"JMP User's Guide", SAS Institute Inc., 1995.
EXAMPLES:
gboxplot(group1,group2,group3)
gboxplot(split(salary,age),varwidth=TRUE,notch=TRUE)
# the example plot is produced by:
gboxplot(
split(lottery.payoff,lottery.number%/%100),
main=lottery.label,
sub="Leading Digit of Winning Numbers",
ylab="Payoff")
gboxplot( split(Mileage, Type), type = "vase", col=1:6)
gboxplot( split(Mileage, Type), type = "pts", jitter.pts=3, col=1:6)
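
For comparison, drawing ordinary boxplots at given x[i] on an existing plot can be sketched with present-day base R's own boxplot(), using its `at' and `add' arguments. This does not use gboxplot; the data set (mtcars) and the positions are arbitrary illustration values.

```r
pdf(NULL)                                     # off-screen device so this runs non-interactively
x_pos <- c(1, 2.5, 6)                         # arbitrary x positions, one per group
groups <- split(mtcars$mpg, mtcars$cyl)       # three groups: 4, 6 and 8 cylinders

# Set up the existing plot, then add the boxes at the chosen positions.
plot(range(x_pos) + c(-1, 1), range(mtcars$mpg), type = "n",
     xaxt = "n", xlab = "x[i]", ylab = "mpg")
b <- boxplot(groups, at = x_pos, add = TRUE, axes = FALSE, boxwex = 0.6)
axis(1, at = x_pos, labels = names(groups))   # label the boxes at their positions
invisible(dev.off())
```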