[R] Doing operations by grouping variable
Bill.Venables at csiro.au
Wed Sep 22 00:14:55 CEST 2010
You left out the subscript. Why not just do
d <- within(data.frame(group = rep(1:5, each = 5), variable = rnorm(25)),
            scaled <- variable/tapply(variable, group, max)[group])
and be done with it?
(Warning: if you replace the second '<-' above by '=', it will not work.
It is NOT true that you can always replace '<-' by '=' for assignment. Why?)
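One way to see the point of this warning, offered as a sketch rather than as the poster's own answer: at the top level of a function call, '=' names an argument instead of assigning, so within() never receives an expression to evaluate. The set.seed() call and the names ok, bad and ok2 are assumptions added for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
d <- data.frame(group = rep(1:5, each = 5), variable = rnorm(25))

# '<-' inside within() is an assignment evaluated within the data frame
ok <- within(d, scaled <- variable / tapply(variable, group, max)[group])

# '=' at the top level of the call is parsed as an argument named 'scaled',
# so no 'scaled' column is created (current R versions are expected to
# stop with an error, which tryCatch() captures here)
bad <- tryCatch(
  within(d, scaled = variable / tapply(variable, group, max)[group]),
  error = function(e) e
)

# Wrapping the expression in braces turns '=' back into an assignment
ok2 <- within(d, {
  scaled = variable / tapply(variable, group, max)[group]
})
```

Inside braces the two operators behave the same way here, which is why the restriction only shows up when the assignment is itself a function argument.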
From: Seth W Bigelow [mailto:sbigelow at fs.fed.us]
Sent: Wednesday, 22 September 2010 1:43 AM
To: Venables, Bill (CMIS, Cleveland)
Cc: michael.bedward at gmail.com; r-help at r-project.org
Subject: RE: [R] Doing operations by grouping variable
Thanks, Bill and Michael, you have answered the question I asked, but not the one I wished to ask. I want to obtain the maximum in each group of variables, so I can scale each variable by the maximum for its group. If I use tapply, as in the example below, there's a mismatch between the dimensions of the tapply output and the data frame holding the variables.
group <- rep(1:5, each=5) # define grouping variable
variable <- rnorm(25) # generate data
d <- data.frame(group,variable) # bundle together in a data frame
d$scaled <- d$variable/(with(d,tapply(variable,group,max))) # crash and burn
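The mismatch described above can be made visible by comparing lengths: tapply returns one maximum per group, while the data frame has one row per observation. The following is a minimal sketch; set.seed() and the names group_max and per_row_max are assumptions added for illustration. The positional subscript relies on the group codes being exactly 1:5; for other codings, indexing by as.character(group) would be the safer assumption.

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible
d <- data.frame(group = rep(1:5, each = 5), variable = rnorm(25))

group_max <- with(d, tapply(variable, group, max))
length(group_max)   # 5: one maximum per group
length(d$variable)  # 25: one value per row

# Indexing the group maxima by the grouping variable repeats each maximum
# once per row, so numerator and denominator have the same length
per_row_max <- group_max[d$group]
d$scaled <- as.vector(d$variable / per_row_max)

# ave() from the stats package gives the same per-row maxima directly
per_row_max2 <- ave(d$variable, d$group, FUN = max)
```

Either form yields a column that lines up row by row with the original data.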
Dr. Seth W. Bigelow
Biologist, USDA-FS Pacific Southwest Research Station
1731 Research Park Drive, Davis California
<Bill.Venables at csiro.au>
09/20/2010 06:24 PM
<michael.bedward at gmail.com>, <sbigelow at fs.fed.us>, <r-help at r-project.org>
RE: [R] Doing operations by grouping variable
That's if the variables are visible. If they are only in the data frame, it's not much more difficult:
d <- data.frame(group = rep(1:5, each=5),
                variable = rnorm(25))
with(d, tapply(variable, group, max))
(Tip: avoid using attach().)
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Bedward
Sent: Tuesday, 21 September 2010 11:15 AM
To: Seth W Bigelow; Rhelp
Subject: Re: [R] Doing operations by grouping variable
Not sure why you think tapply is "awkward". Your example would be...
group <- rep(1:5, each=5)
variable <- rnorm(25)
tapply(variable, group, max)
...which looks quite elegant to me :)
Meanwhile, the reason your expression doesn't work is that you are
mistakenly asking for elements 1:5 of the variable column, repeatedly.
If you just type d$variable[ d$group ] and compare the values to your
variable vector this should be clear.
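The comparison suggested above can be sketched as follows. The set.seed() call and the name picked are assumptions added for illustration; the logical subscript and the split() call are shown only as one possible contrast, not as the approach recommended in this thread.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
group <- rep(1:5, each = 5)
variable <- rnorm(25)

# group holds the integers 1..5, so as a subscript it selects elements
# 1..5 of variable, five times each, rather than selecting "by group"
picked <- variable[group]
identical(picked, rep(variable[1:5], each = 5))  # TRUE

# A logical subscript picks out the members of a single group
max(variable[group == 3])

# Splitting by group and applying max gives one maximum per group
sapply(split(variable, group), max)
```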
On 21 September 2010 10:59, Seth W Bigelow <sbigelow at fs.fed.us> wrote:
> I'm writing an expression that requires searching a vector according to
> group. As an example, I want to find the maximum value in each of 5 groups:
> group=rep(1:5, each=5) # create grouping variable
> variable=rnorm(25) # generate data
> d <- data.frame(group,variable) # make data frame
> max(d$variable[d$group]) # try expression that doesn't work
> I'm expecting a vector containing the maximum variable value, per group.
> What am I doing wrong? I know I can use aggregate, tapply, etc. but that
> seems awkward and bulky, is there a simpler way?
> Dr. Seth W. Bigelow
> Biologist, USDA-FS Pacific Southwest Research Station
> 1731 Research Park Drive, Davis California
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.