[R] cbind in aggregate formula - based on an existing object (vector)

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Fri Jul 15 15:15:38 CEST 2011


THAT'S IT, Bill - exactly what I was looking for! Thanks a lot for the
input, everyone.
I find the "by" method the most straightforward and clear.
Dimitri
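
For reference, here is a minimal sketch of the "by" (data.frame) method
applied to the example data from the original post further down. The
name byvars is introduced only for this sketch; everything else is taken
from the thread.

```r
# Rebuild the example data from the original post.
mydate <- rep(seq(as.Date("2008-12-01"), length.out = 3, by = "month"), 4)
example <- data.frame(
  mydate = mydate,
  value1 = c(1, 10, 100, 2, 20, 200, 3, 30, 300, 4, 40, 400),
  value2 = c(1.1, 10.1, 100.1, 2.1, 20.1, 200.1,
             3.1, 30.1, 300.1, 4.1, 40.1, 400.1),
  group  = factor(rep(rep(c("group1", "group2"), each = 3), 2))
)

myvars <- c("value1", "value2")  # columns to aggregate (from the thread)
byvars <- c("group", "mydate")   # grouping columns (name assumed here)

# data.frame method: columns are selected by name, so no formula is needed
# and the result keeps the original column names.
agg_by <- aggregate(example[myvars], by = example[byvars], FUN = sum)

# Formula method with the names typed out, for comparison.
agg_formula <- aggregate(cbind(value1, value2) ~ group + mydate,
                         data = example, FUN = sum)
```

Because the columns are picked with character vectors, the same call
should work unchanged when myvars holds many more names.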


On Thu, Jul 14, 2011 at 5:12 PM, William Dunlap <wdunlap at tibco.com> wrote:
> You may find it easier to use the data.frame method for aggregate
> instead of the formula method when you are using vectors of column
> names.   E.g.,
>
>  responseVars <- c("mpg", "wt")
>  byVars <- c("cyl", "gear")
>  aggregate(mtcars[responseVars], by=mtcars[byVars], FUN=median)
>
> gives the same result as
>
>  aggregate(cbind(mpg, wt) ~ cyl + gear, FUN=median, data=mtcars)
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Dimitri Liakhovitski
> Sent: Thursday, July 14, 2011 1:45 PM
> To: David Winsemius
> Cc: r-help
> Subject: Re: [R] cbind in aggregate formula - based on an existing object (vector)
>
> Thanks a lot!
>
> actually, what I tried to do is very simple - just passing tons of
> variable names into the formula. Maybe that "get" thing suggested by
> Bert would work...
>
> Dimitri
>
>
> On Thu, Jul 14, 2011 at 4:01 PM, David Winsemius <dwinsemius at comcast.net> wrote:
>> Dimitri:
>>
>> as.matrix makes a matrix out of the dataframe that is passed to it.
>>
>> As a further note, I attempted, and failed for reasons that are unclear to me,
>> to construct a formula that would (I hoped) preserve the column names, which
>> are being mangled in the posted effort:
>>
>> form <- as.formula(paste(
>>                     "cbind(",
>>                      paste( myvars, collapse=","),
>>                      ") ~ group+mydate",
>>                      sep=" ") )
>>> myvars<-c("value1","value2")
>>> example.agg1<-aggregate(formula=form,data=example, FUN=sum)
>> Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
>>> traceback()
>> 2: aggregate.formula(formula = form, data = example, FUN = sum)
>> 1: aggregate(formula = form, data = example, FUN = sum)
>>
>>> form
>> cbind(value1, value2) ~ group + mydate
>>> parse(text=form)
>> expression(~
>> cbind(value1, value2), group + mydate)
>>
>> So it seems to be correctly dispatched to aggregate.formula but is not passing
>> some check or another. I also tried formula() rather than as.formula(), with an
>> identical error message, and tried passing the formula without naming the argument.
>>
>> --
>> David
>>
>>
>> On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote:
>>
>>> Thank you, David, it does work.
>>> Could you please explain why? What exactly does changing it to "as.matrix"
>>> do?
>>> Thank you!
>>> Dimitri
>>>
>>> On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius <dwinsemius at comcast.net>
>>> wrote:
>>>>
>>>> On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> I am aggregating using a formula in aggregate - of the type:
>>>>> aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)
>>>>>
>>>>> However, I actually have an object (vector of my variables to be
>>>>> aggregated):
>>>>> myvars<-c("var1","var2","var3")
>>>>>
>>>>> I'd like my aggregate formula (its "cbind" part) to be able to use my
>>>>> "myvars" object. Is it possible?
>>>>> Thanks for your help!
>>>>>
>>>>
>>>> Not sure I have gotten all the way there, but this does work:
>>>>
>>>>
>>>> example.agg1<-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)
>>>>
>>>>> example.agg1
>>>>
>>>>  group     mydate example[myvars]    NA
>>>> 1 group1 2008-12-01               4   4.2
>>>> 2 group2 2008-12-01               6   6.2
>>>> 3 group1 2009-01-01              40  40.2
>>>> 4 group2 2009-01-01              60  60.2
>>>> 5 group1 2009-02-01             400 400.2
>>>> 6 group2 2009-02-01             600 600.2
>>>>
>>>>> Dimitri
>>>>>
>>>>> Reproducible example:
>>>>>
>>>>> mydate = rep(seq(as.Date("2008-12-01"), length = 3, by = "month"),4)
>>>>> value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
>>>>> value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)
>>>>>
>>>>> example<-data.frame(mydate=mydate,value1=value1,value2=value2)
>>>>>
>>>>>
>>>>> example$group<-c(rep("group1",3),rep("group2",3),rep("group1",3),rep("group2",3))
>>>>> example$group<-as.factor(example$group)
>>>>> (example);str(example)
>>>>>
>>>>>
>>>>>
>>>>> example.agg1<-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
>>>>> # this works
>>>>> (example.agg1)
>>>>>
>>>>> ### Building my object (vector of 2 names - in reality, many more):
>>>>> myvars<-c("value1","value2")
>>>>> example.agg1<-aggregate(cbind(myvars)~group+mydate,sum,data=example)
>>>>> ### does not work
>>>>>
>>>>>
>>>>> --
>>>>> Dimitri Liakhovitski
>>>>> Ninah Consulting
>>>>> www.ninah.com
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Dimitri Liakhovitski
>>> Ninah Consulting
>>> www.ninah.com
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>
>
>
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com
>
>

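On the formula-building attempt quoted above: the sketch below pastes
the names from myvars into the cbind() part and converts the string with
as.formula(). The call failed with the error shown in the R version used
in the thread; as far as I know, more recent versions of R accept a
formula stored in a variable, but treat that as an assumption to check
on your own installation. The formula is passed positionally on purpose.

```r
# Rebuild the example data from the original post.
mydate <- rep(seq(as.Date("2008-12-01"), length.out = 3, by = "month"), 4)
example <- data.frame(
  mydate = mydate,
  value1 = c(1, 10, 100, 2, 20, 200, 3, 30, 300, 4, 40, 400),
  value2 = c(1.1, 10.1, 100.1, 2.1, 20.1, 200.1,
             3.1, 30.1, 300.1, 4.1, 40.1, 400.1),
  group  = factor(rep(rep(c("group1", "group2"), each = 3), 2))
)

myvars <- c("value1", "value2")

# Build "cbind(value1, value2) ~ group + mydate" as text, then convert it.
form <- as.formula(
  paste0("cbind(", paste(myvars, collapse = ", "), ") ~ group + mydate")
)

# Pass the formula as the first, unnamed argument: the name of that
# argument has not been the same in every R version.
agg <- aggregate(form, data = example, FUN = sum)
```

If this still fails on a given installation, the data.frame method
suggested by Bill avoids the formula entirely.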


-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


