spineplot {graphics} | R Documentation |
Spine Plots and Spinograms
Description
Spine plots are a special case of mosaic plots, and can be seen as a generalization of stacked (or highlighted) bar plots. Analogously, spinograms are an extension of histograms.
Usage
spineplot(x, ...)
## Default S3 method:
spineplot(x, y = NULL,
breaks = NULL, tol.ylab = 0.05, off = NULL,
ylevels = NULL, col = NULL,
main = "", xlab = NULL, ylab = NULL,
xaxlabels = NULL, yaxlabels = NULL,
xlim = NULL, ylim = c(0, 1), axes = TRUE, weights = NULL, ...)
## S3 method for class 'formula'
spineplot(formula, data = NULL,
breaks = NULL, tol.ylab = 0.05, off = NULL,
ylevels = NULL, col = NULL,
main = "", xlab = NULL, ylab = NULL,
xaxlabels = NULL, yaxlabels = NULL,
xlim = NULL, ylim = c(0, 1), axes = TRUE, ...,
subset = NULL, weights = NULL, drop.unused.levels = FALSE)
Arguments
x |
an object. The default method expects either a single variable (interpreted to be the explanatory variable) or a 2-way table. See Details. |
y |
a |
formula |
a |
data |
an optional data frame. |
breaks |
if the explanatory variable is numeric, this controls how
it is discretized. |
tol.ylab |
convenience tolerance parameter for y-axis annotation. If the distance between two labels drops under this threshold, they are plotted equidistantly. |
off |
vertical offset between the bars (in per cent). It is fixed to
|
ylevels |
a character or numeric vector specifying in which order the levels of the dependent variable should be plotted. |
col |
a vector of fill colors of the same length as |
main , xlab , ylab |
character strings for annotation |
xaxlabels , yaxlabels |
character vectors for annotation of x and y axis.
Default to |
xlim , ylim |
the range of x and y values with sensible defaults. |
axes |
logical. If |
weights |
numeric. A vector of frequency weights for each
observation in the data. If |
... |
additional arguments passed to |
subset |
an optional vector specifying a subset of observations to be used for plotting. |
drop.unused.levels |
should factors have unused levels dropped?
Defaults to |
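As a rough illustration of how a few of the arguments above might be combined, the following sketch draws a spine plot from aggregated data. It assumes a version of R in which spineplot accepts weights as documented here; the data frame name ucb and the colour choices are arbitrary and not part of the original documentation.

```r
# Aggregated data: one row per Admit x Gender x Dept cell, with counts in Freq.
ucb <- as.data.frame(UCBAdmissions)

# weights: frequency weight of each row (here the cell counts)
# col:     one fill colour per level of the dependent variable (Admit)
# off:     offset between the bars, in per cent
tab <- spineplot(Admit ~ Dept, data = ucb, weights = Freq,
                 col = c("steelblue", "grey85"), off = 5,
                 main = "Admissions at UCB")

# The weighted 2-way table (Dept by Admit) that was visualized.
tab
```

The weighted call should produce the same picture as passing the corresponding 2-way margin of UCBAdmissions directly.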
Details
spineplot creates either a spinogram or a spine plot. It can be called via spineplot(x, y) or spineplot(y ~ x), where y is interpreted to be the dependent variable (and has to be categorical) and x the explanatory variable. x can be either categorical (then a spine plot is created) or numerical (then a spinogram is plotted). Additionally, spineplot can also be called with only a single argument, which then has to be a 2-way table, interpreted to correspond to table(x, y).
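The three calling conventions can be sketched side by side. The data are the arthritis figures used in the Examples section, rebuilt here so the snippet is self-contained; the object names tab_xy, tab_fml and tab_tab are illustrative only.

```r
# Explanatory variable x (treatment) and dependent variable y (improved).
treatment <- factor(rep(c("placebo", "treated"), c(43, 41)))
improved  <- factor(rep(rep(c("none", "some", "marked"), 2),
                        c(29, 7, 7, 13, 7, 21)),
                    levels = c("none", "some", "marked"))

tab_xy  <- spineplot(treatment, improved)         # spineplot(x, y)
tab_fml <- spineplot(improved ~ treatment)        # spineplot(y ~ x)
tab_tab <- spineplot(table(treatment, improved))  # a single 2-way table

# Each call returns the underlying x-by-y table of counts.
tab_xy
```

Note the argument order: the explanatory variable comes first in spineplot(x, y), but appears on the right-hand side of the formula.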
Both spine plots and spinograms are essentially mosaic plots with special formatting of spacing and shading. Conceptually, they plot P(y | x) against P(x). For the spine plot (where both x and y are categorical), both quantities are approximated by the corresponding empirical relative frequencies. For the spinogram (where x is numerical), x is first discretized (by calling hist with the breaks argument) and then empirical relative frequencies are taken.
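The quantities described above can be computed by hand. This is a minimal sketch rather than the internal implementation: it assumes the default breaks, and names such as p_x, p_y_given_x and tab_binned are invented for illustration.

```r
## Spine plot: both variables categorical.
treatment <- factor(rep(c("placebo", "treated"), c(43, 41)))
improved  <- factor(rep(rep(c("none", "some", "marked"), 2),
                        c(29, 7, 7, 13, 7, 21)),
                    levels = c("none", "some", "marked"))
tab <- table(treatment, improved)
p_x         <- rowSums(tab) / sum(tab)       # empirical P(x): bar widths
p_y_given_x <- prop.table(tab, margin = 1)   # empirical P(y | x): segment heights

## Spinogram: numerical x is discretized first.
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
                 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = c(1, 2), labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)
brk        <- hist(temperature, plot = FALSE)$breaks  # default histogram breaks
x_binned   <- cut(temperature, breaks = brk, include.lowest = TRUE)
tab_binned <- table(x_binned, fail)

# The table returned by spineplot should hold the same counts.
tab_spino <- spineplot(fail ~ temperature)
all.equal(as.numeric(tab_binned), as.numeric(tab_spino))
```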
Thus, spine plots can also be seen as a generalization of stacked bar plots where not the heights but the widths of the bars correspond to the relative frequencies of x. The heights of the bars then correspond to the conditional relative frequencies of y in every x group. Analogously, spinograms extend stacked histograms.
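To make the relationship to stacked bar plots concrete, the sketch below draws three panels: an ordinary stacked bar plot of the conditional frequencies, the same bars with widths set to the relative frequencies of x, and the spine plot itself. Using barplot with width and space = 0 is only an approximation chosen for illustration; it is not how spineplot draws its bars.

```r
treatment <- factor(rep(c("placebo", "treated"), c(43, 41)))
improved  <- factor(rep(rep(c("none", "some", "marked"), 2),
                        c(29, 7, 7, 13, 7, 21)),
                    levels = c("none", "some", "marked"))
tab <- table(treatment, improved)

heights <- t(prop.table(tab, margin = 1))  # one column per x group, summing to 1
widths  <- rowSums(tab) / sum(tab)         # relative frequency of each x group

op <- par(mfrow = c(1, 3))
barplot(heights, main = "Stacked bars, equal widths")
mid <- barplot(heights, width = widths, space = 0,
               main = "Widths set to P(x)")
spineplot(tab, main = "Spine plot")
par(op)
```

The second and third panels should show the same proportions; the spine plot adds the offset between bars and the axis annotation.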
Value
The table visualized is returned invisibly.
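Because the result is invisible, it is not printed unless it is assigned, wrapped in parentheses, or otherwise captured. A small sketch using withVisible from base R (the names res and tab are illustrative):

```r
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
                 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = c(1, 2), labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

# withVisible() returns both the value and its visibility flag.
res <- withVisible(spineplot(fail ~ temperature))
res$visible   # FALSE: the table is returned invisibly

# The captured table can be used like any other 2-way table.
tab <- res$value
prop.table(tab, margin = 1)
```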
Author(s)
Achim Zeileis Achim.Zeileis@R-project.org
References
Friendly, M. (1994). Mosaic displays for multi-way contingency tables. Journal of the American Statistical Association, 89, 190–200. doi:10.2307/2291215.
Hartigan, J.A., and Kleiner, B. (1984). A mosaic of television ratings. The American Statistician, 38, 32–35. doi:10.2307/2683556.
Hofmann, H., and Theus, M. (2005). Interactive graphics for visualizing conditional distributions. Unpublished manuscript.
Hummel, J. (1996). Linked bar charts: Analysing categorical data graphically. Computational Statistics, 11, 23–33.
See Also
Examples
## treatment and improvement of patients with rheumatoid arthritis
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
labels = c("placebo", "treated"))
improved <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
levels = c(1, 2, 3),
labels = c("none", "some", "marked"))
## (dependence on a categorical variable)
(spineplot(improved ~ treatment))
## applications and admissions by department at UC Berkeley
## (two-way tables)
(spineplot(marginSums(UCBAdmissions, c(3, 2)),
main = "Applications at UCB"))
(spineplot(marginSums(UCBAdmissions, c(3, 1)),
main = "Admissions at UCB"))
## NASA space shuttle o-ring failures
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1,
1, 1, 1, 2, 1, 1, 1, 1, 1),
levels = c(1, 2), labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)
## (dependence on a numerical variable)
(spineplot(fail ~ temperature))
(spineplot(fail ~ temperature, breaks = 3))
(spineplot(fail ~ temperature, breaks = quantile(temperature)))
## highlighting for failures
spineplot(fail ~ temperature, ylevels = 2:1)