xtabs {stats}  R Documentation 
Cross Tabulation
Description
Create a contingency table (optionally a sparse matrix) from cross-classifying factors, usually contained in a data frame, using a formula interface.
Usage
xtabs(formula = ~., data = parent.frame(), subset, sparse = FALSE,
na.action, na.rm = FALSE, addNA = FALSE,
exclude = if(!addNA) c(NA, NaN), drop.unused.levels = FALSE)
## S3 method for class 'xtabs'
print(x, na.print = "", ...)
Arguments
formula 
a formula object with the cross-classifying variables
(separated by +)
data 
an optional matrix or data frame (or similar: see

subset 
an optional vector specifying a subset of observations to be used. 
sparse 
logical specifying if the result should be a
sparse matrix, i.e., inheriting from

na.action 
a 
na.rm 
logical: should missing values on the left-hand side of the

addNA 
logical indicating if 
exclude 
a vector of values to be excluded when forming the set of levels of the classifying factors. 
drop.unused.levels 
a logical indicating whether to drop unused
levels in the classifying factors. If this is 
x 
an object of class 
na.print 
character string (or 
... 
further arguments passed to or from other methods. 
Details
There is a summary method for contingency table objects created by table or xtabs(*, sparse = FALSE), which gives basic information and performs a chi-squared test for independence of factors (note that the function chisq.test currently only handles 2-d tables).
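As a minimal sketch of the summary method described above, the following uses the built-in UCBAdmissions data (the same data used in the Examples). The object names tab and s are illustrative only. For a two-way table, the statistic reported by summary can be compared with chisq.test without continuity correction.

```r
## Illustrative sketch; 'tab' and 's' are hypothetical names.
DF  <- as.data.frame(UCBAdmissions)
tab <- xtabs(Freq ~ Gender + Admit, data = DF)

s <- summary(tab)   # dispatches to the summary method for tables
s$n.cases           # total number of cases in the table
s$statistic         # chi-squared statistic for independence
s$parameter         # degrees of freedom
s$p.value

## For a 2-d table, the uncorrected Pearson statistic from chisq.test
## can be used as a cross-check:
chisq.test(tab, correct = FALSE)$statistic
```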
If a left-hand side is given in formula, its entries are simply summed over the cells corresponding to the right-hand side; this also works if the LHS does not give counts.
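The summing behaviour can be sketched with a left-hand side that is not a count. This example assumes the built-in mtcars data and uses car weight (wt) as the LHS; the names tab and ref are illustrative. The per-cell sums are compared against tapply with sum.

```r
## Illustrative sketch: the LHS 'wt' is a measurement, not a count.
tab <- xtabs(wt ~ cyl, data = mtcars)

## Reference computation: sum 'wt' within each level of 'cyl'.
ref <- tapply(mtcars$wt, mtcars$cyl, sum)

tab
all.equal(as.vector(tab), as.vector(ref))
```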
For variables in formula which are factors, exclude must be specified explicitly; the default exclusions will not be used.
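A small sketch of passing exclude explicitly for a factor variable, assuming the built-in warpbreaks data (where tension is a factor with levels L, M, H). The names full and noL are illustrative; this shows only the explicit case and is not a complete account of how exclude interacts with the NA-handling arguments.

```r
## Illustrative sketch; 'full' and 'noL' are hypothetical names.
full <- xtabs(~ tension, data = warpbreaks)                 # all levels
noL  <- xtabs(~ tension, data = warpbreaks, exclude = "L")  # drop level "L"

full
noL
```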
In R versions before 3.4.0, e.g., when na.action = na.pass, sometimes zeroes (0) were returned instead of NAs.
In R versions before 4.4.0, when !addNA as by default, the default na.action was na.omit, effectively treating missing counts as zero.
Value
By default, when sparse = FALSE, a contingency table in array representation of S3 class c("xtabs", "table"), with a "call" attribute storing the matched call.
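The structure of the default (dense) result can be inspected directly. A minimal sketch, assuming the built-in warpbreaks data; the name tab is illustrative.

```r
## Illustrative sketch; 'tab' is a hypothetical name.
tab <- xtabs(~ wool + tension, data = warpbreaks)

class(tab)          # S3 class vector of the result
attr(tab, "call")   # the matched call stored with the table
dim(tab)            # one dimension per classifying factor
```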
When sparse = TRUE, a sparse numeric matrix, specifically an object of S4 class dgTMatrix from package Matrix.
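A minimal sketch comparing the sparse and dense results, assuming package Matrix is installed and using the built-in warpbreaks data. The names sp and dn are illustrative. The check is written against the general sparseMatrix class rather than a specific storage class.

```r
## Illustrative sketch; runs only if package 'Matrix' is available.
if (requireNamespace("Matrix", quietly = TRUE)) {
  sp <- xtabs(~ wool + tension, data = warpbreaks, sparse = TRUE)
  dn <- xtabs(~ wool + tension, data = warpbreaks)

  print(class(sp))                         # S4 class of the sparse result
  print(methods::is(sp, "sparseMatrix"))   # inherits from sparseMatrix
  print(all(as.matrix(sp) == unclass(dn))) # same counts as the dense table
}
```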
See Also
table for traditional cross-tabulation, and as.data.frame.table which is the inverse operation of xtabs (see the DF example below).
sparseMatrix on sparse matrices in package Matrix.
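The inverse relationship between as.data.frame.table and xtabs mentioned above can be sketched as a round trip, assuming the built-in UCBAdmissions table. The name back is illustrative; the variables on the right-hand side are listed in the original dimension order so that the arrays line up.

```r
## Illustrative round trip: table -> data frame -> table.
DF   <- as.data.frame(UCBAdmissions)
back <- xtabs(Freq ~ Admit + Gender + Dept, data = DF)

identical(dim(back), dim(UCBAdmissions))
all(back == UCBAdmissions)
```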
Examples
## 'esoph' has the frequencies of cases and controls for all levels of
## the variables 'agegp', 'alcgp', and 'tobgp'.
xtabs(cbind(ncases, ncontrols) ~ ., data = esoph)
## Output is not really helpful ... flat tables are better:
ftable(xtabs(cbind(ncases, ncontrols) ~ ., data = esoph))
## In particular if we have fewer factors ...
ftable(xtabs(cbind(ncases, ncontrols) ~ agegp, data = esoph))
## This is already a contingency table in array form.
DF <- as.data.frame(UCBAdmissions)
## Now 'DF' is a data frame with a grid of the factors and the counts
## in variable 'Freq'.
DF
## Nice for taking margins ...
xtabs(Freq ~ Gender + Admit, DF)
## And for testing independence ...
summary(xtabs(Freq ~ ., DF))
## with NA's
DN <- DF; DN[cbind(6:9, c(1:2,4,1))] <- NA
DN # 'Freq' is missing only for (Rejected, Female, B)
(xtNA <- xtabs(Freq ~ Gender + Admit, DN)) # NA prints 'invisibly'
print(xtNA, na.print = "NA") # show NA's better
xtabs(Freq ~ Gender + Admit, DN, na.rm = TRUE) # ignore missing Freq
## Use addNA = TRUE to tabulate missing factor levels:
xtabs(Freq ~ Gender + Admit, DN, addNA = TRUE)
xtabs(Freq ~ Gender + Admit, DN, addNA = TRUE, na.rm = TRUE)
## na.action = na.omit removes all rows with NAs right from the start:
xtabs(Freq ~ Gender + Admit, DN, na.action = na.omit)
## Create a nice display for the warp break data.
warpbreaks$replicate <- rep_len(1:9, 54)
ftable(xtabs(breaks ~ wool + tension + replicate, data = warpbreaks))
### Sparse Examples
if(require("Matrix")) withAutoprint({
## similar to "nlme"s 'ergoStool' :
d.ergo <- data.frame(Type = paste0("T", rep(1:4, 9*4)),
                     Subj = gl(9, 4, 36*4))
xtabs(~ Type + Subj, data = d.ergo) # 4 replicates each
set.seed(15) # a subset of cases:
xtabs(~ Type + Subj, data = d.ergo[sample(36, 10), ], sparse = TRUE)
## Hypothetical two-level setup:
inner <- factor(sample(letters[1:25], 100, replace = TRUE))
inout <- factor(sample(LETTERS[1:5], 25, replace = TRUE))
fr <- data.frame(inner = inner, outer = inout[as.integer(inner)])
xtabs(~ inner + outer, fr, sparse = TRUE)
})