[R] cbind in aggregate formula - based on an existing object (vector)

peter dalgaard pdalgd at gmail.com
Fri Jul 15 15:06:55 CEST 2011


For a little lateral thinking, consider the use of "." on the LHS. That could play out as follows:

> myvars <- c("Ozone","Wind")
> f <- . ~ Month
> j <- union(all.vars(f[[3]]), myvars)
> aggregate(. ~ Month, data=airquality[j], mean, na.rm=TRUE)
  Month    Ozone      Wind
1     5 23.61538 11.457692
2     6 29.44444 12.177778
3     7 59.11538  8.523077
4     8 59.96154  8.565385
5     9 31.44828 10.075862
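
For what it is worth, here is a minimal sketch of the same "." idea applied to the data from the original question. The column "extra" and the names j and agg_dot are made up for illustration; "extra" is there only to show that subsetting the data frame controls what the dot expands to.

```r
# Data from the original question, plus one hypothetical column
# ("extra") that we do NOT want to aggregate.
mydate <- rep(seq(as.Date("2008-12-01"), length = 3, by = "month"), 4)
example <- data.frame(
  mydate = mydate,
  value1 = c(1, 10, 100, 2, 20, 200, 3, 30, 300, 4, 40, 400),
  value2 = c(1.1, 10.1, 100.1, 2.1, 20.1, 200.1,
             3.1, 30.1, 300.1, 4.1, 40.1, 400.1),
  group  = factor(rep(rep(c("group1", "group2"), each = 3), 2)),
  extra  = 1:12
)
myvars <- c("value1", "value2")

# Keep only the grouping variables and the variables to aggregate,
# so that "." on the LHS stands for exactly the columns in myvars.
j <- union(c("group", "mydate"), myvars)
agg_dot <- aggregate(. ~ group + mydate, data = example[j], FUN = sum)
print(agg_dot)
```

This should give the same sums as the by= version quoted below, with the original column names value1 and value2.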

(And of course, when you play with something unusual, a buglet pops up: it does not work if you pass f instead of the explicit formula in the call to aggregate.)
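
If you would rather keep the cbind() form from the original question, another common route (not the one used above, just a sketch) is to paste the formula together as text and convert it with as.formula(). The names lhs, f_cbind and agg_cbind are made up for illustration.

```r
# Data from the original question.
mydate <- rep(seq(as.Date("2008-12-01"), length = 3, by = "month"), 4)
example <- data.frame(
  mydate = mydate,
  value1 = c(1, 10, 100, 2, 20, 200, 3, 30, 300, 4, 40, 400),
  value2 = c(1.1, 10.1, 100.1, 2.1, 20.1, 200.1,
             3.1, 30.1, 300.1, 4.1, 40.1, 400.1),
  group  = factor(rep(rep(c("group1", "group2"), each = 3), 2))
)
myvars <- c("value1", "value2")

# Build the text "cbind(value1, value2)" from the character vector ...
lhs <- paste0("cbind(", paste(myvars, collapse = ", "), ")")

# ... and turn "cbind(value1, value2) ~ group + mydate" into a formula.
f_cbind <- as.formula(paste(lhs, "~ group + mydate"))

agg_cbind <- aggregate(f_cbind, data = example, FUN = sum)
print(agg_cbind)
```

Because the left-hand side is spelled out, no dot expansion is involved here, and the data frame does not need to be subset first.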


On Jul 15, 2011, at 00:10 , Dennis Murphy wrote:

> Hi:
> 
> I think Bill's got the right idea for your problem, but for the fun of
> it, here's how Bert's suggestion would play out:
> 
> # Kind of works, but only for the first variable in myvars...
>> aggregate(get(myvars) ~ group + mydate, FUN = sum, data = example)
>   group     mydate get(myvars)
> 1 group1 2008-12-01           4
> 2 group2 2008-12-01           6
> 3 group1 2009-01-01          40
> 4 group2 2009-01-01          60
> 5 group1 2009-02-01         400
> 6 group2 2009-02-01         600
> 
> # Maybe sapply() with get as the function will work...
>> aggregate(sapply(myvars, get) ~ group + mydate, FUN = sum, data = example)
>   group     mydate myvars   get
> 1 group1 2008-12-01      4   4.2
> 2 group2 2008-12-01      6   6.2
> 3 group1 2009-01-01     40  40.2
> 4 group2 2009-01-01     60  60.2
> 5 group1 2009-02-01    400 400.2
> 6 group2 2009-02-01    600 600.2
> 
> Apart from the variable names, it matches example.agg1. OTOH, Bill's
> suggestion matches example.agg1 exactly and has an advantage in terms
> of code clarity:
> 
> byVars <- c('group', 'mydate')
>> aggregate(example[myvars], by = example[byVars], FUN = sum)
>   group     mydate value1 value2
> 1 group1 2008-12-01      4    4.2
> 2 group2 2008-12-01      6    6.2
> 3 group1 2009-01-01     40   40.2
> 4 group2 2009-01-01     60   60.2
> 5 group1 2009-02-01    400  400.2
> 6 group2 2009-02-01    600  600.2
> 
> FWIW,
> Dennis
> 
> On Thu, Jul 14, 2011 at 12:05 PM, Dimitri Liakhovitski
> <dimitri.liakhovitski at gmail.com> wrote:
>> Hello!
>> 
>> I am aggregating using a formula in aggregate - of the type:
>> aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)
>> 
>> However, I actually have an object (vector of my variables to be aggregated):
>> myvars<-c("var1","var2","var3")
>> 
>> I'd like my aggregate formula (its "cbind" part) to be able to use my
>> "myvars" object. Is it possible?
>> Thanks for your help!
>> 
>> Dimitri
>> 
>> Reproducible example:
>> 
>> mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4)
>> value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
>> value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)
>> 
>> example<-data.frame(mydate=mydate,value1=value1,value2=value2)
>> example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3))
>> example$group<-as.factor(example$group)
>> (example);str(example)
>> 
>> example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
>> # this works
>> (example.agg1)
>> 
>> ### Building my object (vector of 2 names - in reality, many more):
>> myvars<-c("value1","value2")
>> example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example)
>> ### does not work
>> 
>> 
>> --
>> Dimitri Liakhovitski
>> Ninah Consulting
>> www.ninah.com
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


