[R] Summarizing factor data in table?
Tony Plate
tplate at acm.org
Tue Apr 26 20:00:19 CEST 2005
Do you want to count the number of non-NA divisions and organizations in
the data for each year (where duplicates are counted as many times as
they appear)?
> tapply(!is.na(foo$div), foo$yr, sum)
1998 1999 2000
0 4 2
> tapply(!is.na(foo$org), foo$yr, sum)
1998 1999 2000
4 4 2
>
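For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds foo exactly as given in the original post and stores the two counts under the names ndiv and norgs used in the question. Combining them into one data frame at the end is only my choice for display, not something the question asked for.

```r
# Data frame from the original post
foo <- data.frame(yr  = c(rep(1998, 4), rep(1999, 4), rep(2000, 2)),
                  div = factor(c(rep(NA, 4), "A", "B", "C", "D", "A", "C")),
                  org = factor(c(1:4, 1:4, 1, 2)))

# Count non-NA entries per year (duplicates counted each time they appear)
ndiv  <- tapply(!is.na(foo$div), foo$yr, sum)
norgs <- tapply(!is.na(foo$org), foo$yr, sum)

# Optional: put both counts side by side (an assumption about the desired layout)
counts <- data.frame(yr = names(ndiv),
                     ndiv = as.vector(ndiv),
                     norgs = as.vector(norgs))
print(counts)
```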
Or perhaps the number of unique non-NA divisions and organizations in
the data for each year?
> tapply(foo$div, foo$yr, function(x) length(na.omit(unique(x))))
1998 1999 2000
0 4 2
> tapply(foo$org, foo$yr, function(x) length(na.omit(unique(x))))
1998 1999 2000
4 4 2
>
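Since the question mentions cross-tabulation, here is a sketch of the same unique count done with table() instead of tapply(). This is an alternative I am suggesting, not something from the original request. It relies on table() dropping NA values by default while still keeping a row for every year, so 1998 shows up with a count of zero. The helper name count_levels is hypothetical.

```r
# Data frame from the original post
foo <- data.frame(yr  = c(rep(1998, 4), rep(1999, 4), rep(2000, 2)),
                  div = factor(c(rep(NA, 4), "A", "B", "C", "D", "A", "C")),
                  org = factor(c(1:4, 1:4, 1, 2)))

# Hypothetical helper: number of distinct non-NA values of x within each group
count_levels <- function(x, group) {
  tab <- table(group, x)   # NA values of x are excluded by default
  rowSums(tab > 0)         # a value counts once per group if it occurs at all
}

ndiv  <- count_levels(foo$div, foo$yr)
norgs <- count_levels(foo$org, foo$yr)
print(ndiv)
print(norgs)
```

With this data the results agree with the tapply() version above, because no division or organization is repeated within a year.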
(I don't understand where the "3" for 1999 in your desired output comes
from, though: the 1999 rows contain four distinct divisions, A, B, C and D.
Maybe that indicates I completely misunderstand your request.)
Andy Bunn wrote:
> I have a very simple query with regard to summarizing the number of factors
> present in a certain snippet of a data frame.
> Given the following data frame:
>
> foo <- data.frame(yr = c(rep(1998,4), rep(1999,4), rep(2000,2)),
>                   div = factor(c(rep(NA,4),"A","B","C","D","A","C")),
>                   org = factor(c(1:4,1:4,1,2)))
>
> I want to get two new variables. Object ndiv would give the number of
> divisions by year:
> 1998 0
> 1999 3
> 2000 2
> Object norgs would give the number of organizations
> 1998 4
> 1999 4
> 2000 2
> I figure xtabs should be able to do it, but I'm stuck without a for loop.
> Any suggestions? -Andy