[R] Noobie question on aggregate tapply and by

Sun Apr 25 19:47:04 CEST 2010

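Here's one way with aggregate():

aggregate(DF[, 3:4], by = list(years = DF$years), FUN = mean, na.rm = TRUE)

For the second part (pooling years 5-7), recode() from the car package
will do it. The general form, for some vector x, is:

library(car)  # You will probably need to install it.
recode(x, "c(1,2)='A'; else='B'")

So for your data:

DF$years <- recode(DF$years, "c(5,6,7)= '5-7'")
DF

Then rerun the aggregate() call above to get the means for 5-7 and 8.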

You may also want to have a look at the reshape and plyr packages.
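
For comparison, here is a minimal sketch of both summaries in base R only, with no extra packages. It rebuilds the sample data from the question quoted below. The column name year_group and the result names by_year, by_group and data_means are my own, purely for illustration, and ifelse() stands in for recode() here.

```r
# Rebuild the sample data from the original question.
ids   <- seq(1, 50)
years <- c(rep(5, 10), rep(6, 10), rep(7, 10), rep(8, 20))
data  <- c(rep(23.2, 7), rep(14.2, 17), rep(29.2, 6), rep(13.4, 10),
           rep(16.3, 5), NA, rep(40, 4))
data2 <- c(rep(22.2, 5), rep(13.2, 8), NA, rep(29.8, 16), rep(12.4, 10),
           rep(16.3, 5), rep(38, 5))
DF <- data.frame(ids, years, data, data2)

# Means per year; na.rm = TRUE is passed through to mean().
by_year <- aggregate(DF[, c("data", "data2")],
                     by = list(years = DF$years),
                     FUN = mean, na.rm = TRUE)

# Hypothetical helper column: collapse years 5-7 into one label.
DF$year_group <- ifelse(DF$years %in% 5:7, "5-7", as.character(DF$years))

# Means for the pooled groups "5-7" and "8".
by_group <- aggregate(DF[, c("data", "data2")],
                      by = list(year_group = DF$year_group),
                      FUN = mean, na.rm = TRUE)

# The same per-year means for a single column, using tapply().
data_means <- tapply(DF$data, DF$years, mean, na.rm = TRUE)

print(by_year)
print(by_group)
print(data_means)
```

Selecting the columns by name rather than by position avoids picking up ids or years by accident, and keeping the grouping in a separate column leaves the original years untouched.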

--- On Sun, 4/25/10, steven mosher <moshersteven at gmail.com> wrote:

> From: steven mosher <moshersteven at gmail.com>
> Subject: [R] Noobie question on aggregate tapply and by
> To: "r-help" <r-help at r-project.org>
> Received: Sunday, April 25, 2010, 2:29 AM
> I have a 43MB data frame (5 variables) and I'm trying to summarize
> subsets of the data.
> I've RTFM (not very clear) and looked at a variety of samples, but
> can't seem to figure out how to make these functions work.
> 
> A sample of what I want to do would be this:
> 
> ids <- seq(1, 50)
> years <- c(rep(5,10), rep(6,10), rep(7,10), rep(8,20))
>
> data <- c(rep(23.2,7), rep(14.2,17), rep(29.2,6), rep(13.4,10),
>           rep(16.3,5), NA, rep(40,4))
> data2 <- c(rep(22.2,5), rep(13.2,8), NA, rep(29.8,16), rep(12.4,10),
>            rep(16.3,5), rep(38,5))
> DF <- data.frame(ids, years, data, data2)
> 
> That will give you a data frame that is a good analog of what I have.
> I would like to calculate means (with NAs removed, na.rm) for each
> level of years.
>
>       data   data2
> 5     xx     yy
> 6     xx     yz
> 7     ...    ...
> 8     ...    ...
> 
> And then things like this:
>
> 5-7 :   xx     yy
> 8   :   xy     zz