Apply a function to each cell of a ragged array, that is, to each (non-empty) group of values given by a unique combination of the levels of certain factors.

tapply(X, INDEX, FUN = NULL, ..., simplify = TRUE)

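| Argument | Description |
| --- | --- |
| `X` | an atomic object, typically a vector. |
| `INDEX` | a list of one or more factors, each of the same length as `X`. The elements are coerced to factors. |
| `FUN` | the function to be applied, or `NULL`. |
| `...` | optional arguments to `FUN`. |
| `simplify` | logical; if `FALSE`, `tapply` always returns an array of mode `list`. If `TRUE` (the default) and `FUN` always returns a single atomic value, `tapply` returns an array with the mode of that value. |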

If `FUN` is not `NULL`, it is passed to `match.fun`, and hence it can be a function or a symbol or character string naming a function.
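As a minimal sketch of the `match.fun` behaviour described above, the following passes the same summary once as a function object and once as a character string. The names `x`, `g`, `by_function` and `by_name` are illustrative only.

```r
# Illustrative data: four values in two groups
x <- c(1, 2, 3, 4)
g <- c("a", "a", "b", "b")

by_function <- tapply(x, g, sum)    # FUN given as a function object
by_name     <- tapply(x, g, "sum")  # FUN given as a character string

# Both forms are resolved by match.fun() to the same function
identical(by_function, by_name)
```

Either form should give the same grouped sums, since both resolve to `sum`.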

When `FUN` is present, `tapply` calls `FUN` for each cell that has any data in it. If `FUN` returns a single atomic value for each such cell (e.g., functions `mean` or `var`) and when `simplify` is `TRUE`, `tapply` returns a multi-way array containing the values, and `NA` for the empty cells. The array has the same number of dimensions as `INDEX` has components; the number of levels in a dimension is the number of levels (`nlevels()`) in the corresponding component of `INDEX`. Note that if the return value has a class (e.g., an object of class `"Date"`) the class is discarded.
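The simplification rules above can be sketched with a factor that has an unused level, plus a small `"Date"` case to illustrate the loss of class. All object names here (`x`, `f`, `sums`, `d`, `latest`) are made up for the illustration.

```r
# A factor with three levels, of which "c" has no data
x <- 1:6
f <- factor(c("a", "a", "b", "b", "b", "a"), levels = c("a", "b", "c"))

sums <- tapply(x, f, sum)
dim(sums)          # one dimension with nlevels(f) entries
is.na(sums[["c"]]) # the empty cell is filled with NA

# Class is dropped on simplification: the result is not a "Date" object
d <- as.Date(c("2020-01-01", "2020-03-01", "2020-02-01"))
latest <- tapply(d, c("x", "x", "y"), max)
inherits(latest, "Date")

# One possible way to restore the class afterwards
as.Date(as.vector(latest), origin = "1970-01-01")
```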

Note that contrary to S, `simplify = TRUE` always returns an array, possibly 1-dimensional.

If `FUN` does not return a single atomic value, `tapply` returns an array of mode `list` whose components are the values of the individual calls to `FUN`, i.e., the result is a list with a `dim` attribute.

When there is an array answer, its `dimnames` are named by the names of `INDEX` and are based on the levels of the grouping factors (possibly after coercion).
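A small sketch of how the `dimnames` are derived, assuming a named list of two character vectors as `INDEX`. The component names `group` and `kind` are hypothetical.

```r
x <- 1:4
idx <- list(group = c("a", "a", "b", "b"),
            kind  = c("u", "v", "u", "v"))

res <- tapply(x, idx, sum)

# The names of INDEX become the names of the dimnames;
# the entries are the levels of the (coerced) factors
dimnames(res)
```

Here the character vectors are coerced to factors, so the levels appear in sorted order.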

For a list result, the elements corresponding to empty cells are `NULL`.
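To illustrate the list-mode result, the sketch below uses `range`, which returns two values per cell, on a factor with one unused level. The object names are illustrative.

```r
x <- 1:6
f <- factor(c("a", "a", "b", "b", "b", "a"), levels = c("a", "b", "c"))

# range() returns two values per cell, so no simplification takes place
rng <- tapply(x, f, range)

mode(rng)            # a list ...
dim(rng)             # ... with a dim attribute
rng[["a"]]           # values from the call for cell "a"
is.null(rng[["c"]])  # the empty cell holds NULL
```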

Optional arguments to `FUN` supplied by the `...` argument are not divided into cells. It is therefore inappropriate for `FUN` to expect additional arguments with the same length as `X`.
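One possible workaround when a per-observation argument is needed, such as weights, is to group the indices rather than the values and subset both vectors inside `FUN`. This is a sketch of one approach, not the only one; `x`, `w`, `g` and `wm` are illustrative names.

```r
x <- c(1, 2, 3, 4)           # values
w <- c(1, 3, 1, 1)           # per-observation weights, same length as x
g <- c("a", "a", "b", "b")   # grouping

# Passing w through ... would hand the full, unsplit weight vector
# to every call. Instead, split the indices and subset x and w together.
wm <- tapply(seq_along(x), g, function(i) weighted.mean(x[i], w[i]))
wm
```

Arguments of length one, such as `na.rm = TRUE`, are unaffected by this limitation and can be passed through `...` as usual.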

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
*The New S Language*.
Wadsworth & Brooks/Cole.

the convenience functions `by` and `aggregate` (using `tapply`); `apply`, `lapply` with its versions `sapply` and `mapply`.

require(stats)
groups <- as.factor(rbinom(32, n = 5, prob = 0.4))
tapply(groups, groups, length) #- is almost the same as
table(groups)

## contingency table from data.frame : array with named dimnames
tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
tapply(warpbreaks$breaks, warpbreaks[, 3, drop = FALSE], sum)

n <- 17; fac <- factor(rep(1:3, length = n), levels = 1:5)
table(fac)
tapply(1:n, fac, sum)
tapply(1:n, fac, sum, simplify = FALSE)
tapply(1:n, fac, range)
tapply(1:n, fac, quantile)

## example of ... argument: find quarterly means
tapply(presidents, cycle(presidents), mean, na.rm = TRUE)

ind <- list(c(1, 2, 2), c("A", "A", "B"))
table(ind)
tapply(1:3, ind) #-> the split vector
tapply(1:3, ind, sum)

## Some assertions (not held by all patch proposals):
nq <- names(quantile(1:5))
stopifnot(
  identical(tapply(1:3, ind), c(1L, 2L, 4L)),
  identical(tapply(1:3, ind, sum),
            matrix(c(1L, 2L, NA, 3L), 2,
                   dimnames = list(c("1", "2"), c("A", "B")))),
  identical(tapply(1:n, fac, quantile)[-1],
            array(list(`2` = structure(c(2, 5.75, 9.5, 13.25, 17), .Names = nq),
                       `3` = structure(c(3, 6, 9, 12, 15), .Names = nq),
                       `4` = NULL, `5` = NULL),
                  dim = 4, dimnames = list(as.character(2:5)))))

