[R] identify subsets based on two grouping factors
rajarshi.guha at gmail.com
Mon Jan 31 23:26:58 CET 2011
Indeed, tapply is what I needed. To clarify Phil's question, what I needed was
tapply(x$obs, list(cut.grp1, cut.grp2), function(z) table(z))
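
For anyone finding this thread in the archive, here is a minimal self-contained sketch of that call, passing the obs column as the first argument. The grp2 column is an assumption on my part (its definition does not appear in the example as quoted below), as are the set.seed() call and the factor() wrapper, which is there only so that every cell reports both levels. The last line shows a plain table() call that should give the same counts as one three-way array.

```r
set.seed(1)  # assumption: only here so the sketch is repeatable

# grp2 is assumed; the quoted example does not show how it was defined
x <- data.frame(obs  = factor(sample(c("low", "high"), 100, replace = TRUE)),
                grp1 = sample(1:10, 100, replace = TRUE),
                grp2 = sample(1:10, 100, replace = TRUE))

# Bin each grouping variable into 3 intervals
cut.grp1 <- cut(x$grp1, 3)
cut.grp2 <- cut(x$grp2, 3)

# One table of obs counts per (cut.grp1, cut.grp2) combination;
# the result is a 3 x 3 matrix whose cells are tables
res <- tapply(x$obs, list(cut.grp1, cut.grp2), function(z) table(z))

# Counts for the first pair of intervals
res[[1, 1]]

# The same counts as a single three-way contingency table
counts <- table(cut.grp1, cut.grp2, obs = x$obs)
```

Individual cells are pulled out with res[[i, j]]; a combination with no observations comes back as NULL rather than as a table of zeros.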
On Mon, Jan 31, 2011 at 4:50 PM, Bert Gunter <gunter.berton at gene.com> wrote:
> ?tapply is the basic R function for this. There are many other packages
> (e.g. plyr) and functions (e.g. ave) that simplify and streamline this for
> more complicated applications.
> -- Bert
> On Mon, Jan 31, 2011 at 1:43 PM, Rajarshi Guha <rajarshi.guha at gmail.com> wrote:
>> Hi, I have a data.frame that has a categorical variable, for which I
>> would like to look at the distribution of levels of this variable,
>> based on a grouping of two other variables.
>> As an example:
>> x <- data.frame(obs=sample(c('low', 'high'),100, replace=TRUE),
>> grp1=sample(1:10, 100, replace=TRUE),
>> cut.grp1 <- cut(x$grp1, 3)
>> cut.grp2 <- cut(x$grp2, 3)
>> Thus, for each combination of levels in cut.grp1 and cut.grp2, I'd
>> like to obtain the distribution of levels of obs. I know I can loop over
>> each pair of levels in cut.grp1 and cut.grp2, but is there a more
>> elegant way to achieve this?
>> Rajarshi Guha
>> NIH Chemical Genomics Center
> Bert Gunter
> Genentech Nonclinical Biostatistics
NIH Chemical Genomics Center