[R] Doing operations by grouping variable

William Dunlap wdunlap at tibco.com
Tue Sep 21 17:52:18 CEST 2010


Have you tried using ave()?
  group <- rep(1:5,each=5)
  variable <- log(1:25)
  d <- data.frame(group, variable)
  d$scaled <- d$variable/with(d, ave(variable, group, FUN=max))
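
A minimal sketch of why ave() fits here, assuming the same toy data as above (the names group_max, row_max, scaled_lookup and scaled_ave are only illustrative). tapply() returns one value per group, whereas ave() returns the group summary repeated for every row, so it lines up with the data frame:

```r
# Toy data, same shape as the example above: five groups of five values
d <- data.frame(group = rep(1:5, each = 5), variable = log(1:25))

# tapply() gives one maximum per group, so the result has length 5
group_max <- with(d, tapply(variable, group, max))

# ave() gives the group maximum repeated for each row, so length 25
row_max <- with(d, ave(variable, group, FUN = max))

length(group_max)
length(row_max)

# The tapply() result can still be used by looking up each row's group by name
scaled_lookup <- d$variable / as.vector(group_max[as.character(d$group)])
scaled_ave    <- d$variable / row_max

# Both routes should give the same scaled values
all.equal(scaled_lookup, scaled_ave)
```

The lookup by as.character(d$group) relies on tapply() naming its result by group level; ave() avoids that extra step.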

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Seth W Bigelow
> Sent: Tuesday, September 21, 2010 8:43 AM
> To: Bill.Venables at csiro.au
> Cc: r-help at r-project.org
> Subject: Re: [R] Doing operations by grouping variable
> 
> Thanks, Bill and Michael, you have answered the question I asked, but not
> the one I wished to ask.
> I want to obtain the maximum in each group of variables, so I could scale
> each variable by the maximum for its group. If I use tapply, as in the
> example below, there's a mismatch in dimensions of the output of tapply
> [5] and the data frame with the variables [25].
> 
> 
> group = rep(1:5, each=5)                                      # define grouping variable
>
> variable = rnorm(25)                                          # generate data
>
> d <- data.frame(group,variable)                               # bundle together in a data frame
>
> d$scaled <- d$variable/(with(d,tapply(variable,group,max)))   # crash and burn
> 
> 
> 
> 
> 
> Dr. Seth  W. Bigelow
> Biologist, USDA-FS Pacific Southwest Research Station
> 1731 Research Park Drive, Davis California
> 
> 
> 
> 
> <Bill.Venables at csiro.au>
> 09/20/2010 06:24 PM
>
> To: <michael.bedward at gmail.com>, <sbigelow at fs.fed.us>, <r-help at r-project.org>
> cc:
> Subject: RE: [R] Doing operations by grouping variable
> 
> 
> 
> 
> 
> 
> That's if the variables are visible.  If they are only in the data frame,
> it's not much more difficult:
> 
> d <- data.frame(group = rep(1:5, each=5), 
>                 variable = rnorm(25))
> with(d, tapply(variable, group, max))
> 
> 
> (Tip: avoid using attach().)
> 
> Bill Venables. 
> 
> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] 
> On Behalf Of Michael Bedward
> Sent: Tuesday, 21 September 2010 11:15 AM
> To: Seth W Bigelow; Rhelp
> Subject: Re: [R] Doing operations by grouping variable
> 
> Not sure why you think tapply is "awkward". Your example would be...
> 
> group <- rep(1:5, each=5)
> variable <- rnorm(25)
> tapply(variable, group, max)
> 
> ...which looks quite elegant to me :)
> 
> Meanwhile, the reason your expression doesn't work is that
> d$variable[ d$group ] uses the group numbers as positions, so you are
> mistakenly asking for elements 1:5 of the variable column, each one
> repeated five times. If you just type d$variable[ d$group ] and compare
> the values to your variable vector this should be clear.
> 
> Michael
> 
> On 21 September 2010 10:59, Seth W Bigelow <sbigelow at fs.fed.us> wrote:
> > I'm writing an expression that requires searching a vector according to
> > group. As an example, I want to find the maximum value in each of 5
> > groups.
> >
> >
> > group=rep(1:5, each=5)                          # create grouping variable
> >
> > variable=rnorm(25)                              # generate data
> >
> > d <- data.frame(group,variable)                 # make data frame
> >
> > max(d$variable[d$group])                        # try expression that doesn't work
> >
> > I'm expecting a vector containing the maximum variable value, per group.
> > What am I doing wrong? I know I can use aggregate, tapply, etc. but that
> > seems awkward and bulky, is there a simpler way?
> >
> >
> > Dr. Seth  W. Bigelow
> > Biologist, USDA-FS Pacific Southwest Research Station
> > 1731 Research Park Drive, Davis California
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >


