[R] Summarize by two-column factor, retaining original factors

Gabor Grothendieck ggrothendieck at gmail.com
Fri Feb 24 17:34:03 CET 2006


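Or even, passing the grouping columns directly as a data frame
(DF here is the MyDF of Marc's reply below):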

aggregate(DF[3:4], DF[1:2], sum)
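
For anyone who wants to run this, here is a minimal sketch. It assumes the
data frame is called DF and has the four columns shown in Matt's post quoted
below; the object names res and res2 are only for illustration. The last call
uses the formula interface found in later versions of R, which should give
the same sums.

```r
# Rebuild the example data from the quoted question (DF is an assumed name).
DF <- data.frame(
  Name = c("ab", "ab", "ab", "ab", "dv", "dv", "dv"),
  Year = c(2001, 2001, 2002, 2003, 2002, 2002, 2003),
  x    = c(15, 10, 12, 7, 10, 3, 1),
  y    = c(3, 2, 8, 10, 15, 2, 15)
)

# Sum columns 3:4 within each Name/Year combination.
# DF[1:2] is itself a list, so it can be used directly as the 'by' argument,
# and its column names carry over to the result.
res <- aggregate(DF[3:4], DF[1:2], sum)
print(res)

# Alternative sketch using the formula interface of aggregate().
res2 <- aggregate(cbind(x, y) ~ Name + Year, data = DF, FUN = sum)
print(res2)
```

Name and Year come back as ordinary columns of the result, so they remain
available for later analyses by Name.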



On 2/24/06, Marc Schwartz (via MN) <mschwartz at mn.rr.com> wrote:
> On Fri, 2006-02-24 at 08:18 -0800, Matt Crawford wrote:
> > I am having trouble doing the following.  I have a data.frame like
> > this, where x and y are variables that I want to do calculations on:
> >
> > Name Year x y
> > ab   2001  15 3
> > ab   2001  10 2
> > ab   2002  12 8
> > ab   2003  7 10
> > dv   2002  10 15
> > dv   2002  3 2
> > dv   2003  1 15
> >
> > Before I do all the other things I need to do with this data, I need
> > to summarize or collapse the data by name and year.  I've found that I
> > can do things like
> > nameyear<-interaction(dataframe$Name,dataframe$Year)
> > dataframe$nameyear<-nameyear
> > tapply(dataframe$x,dataframe$nameyear,sum)
> > tapply(dataframe$y,dataframe$nameyear,sum)
> > and then bind those together.
> >
> > But my problem is that I need to somehow retain the original Names in
> > my collapsed dataset, so that later I can do analyses with the Name
> > factors.  All I can think of is something like
> > tapply(dataframe$Name,dataframe$nameyear, somefunction?)
> > but nothing seems to work.
> >
> > I'm actually trying to convert a SAS program, and I can't get out of
> > that mindset.  There, it's a simple Proc Means, By Name Year.
> >
> > Thanks for any help or suggestions on the right way to go about this.
> >
> > Matt Crawford
>
> Matt,
>
> Just use aggregate():
>
> > aggregate(MyDF[, 3:4], list(Name = MyDF$Name, Year = MyDF$Year), sum)
>   Name Year  x  y
> 1   ab 2001 25  5
> 2   ab 2002 12  8
> 3   dv 2002 13 17
> 4   ab 2003  7 10
> 5   dv 2003  1 15
>
>
> See ?aggregate for more information.
>
> HTH,
>
> Marc Schwartz



