[R] Aggregrate function
David Winsemius
dwinsemius at comcast.net
Fri Feb 13 06:17:29 CET 2009
Actually what was originally requested is returned by:
xveg[xveg$tot %in% with(xveg, tapply(tot, loc, max)), c("loc","sp")]
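
One caveat with the %in% test: it compares every tot against the maxima of all locations, not only the row's own location. A non-maximal row whose total happens to equal some other location's maximum would therefore be returned as well. Below is a minimal sketch of a per-location comparison using ave(). The modified copy xveg2 and the names grp_max, res_in, res_ave are made up for illustration and are not from the thread.

```r
# Sample data as posted in the original question
xveg <- data.frame(
  loc = c(rep("L1", 3), rep("L2", 5), rep("L3", 2)),
  sp  = c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d"),
  tot = c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
)

# Step 1: maximum total per location (a named vector: L1, L2, L3)
grp_max <- with(xveg, tapply(tot, loc, max))

# Step 2: keep rows whose tot appears anywhere among those maxima
res_in <- xveg[xveg$tot %in% grp_max, c("loc", "sp")]

# Alternative: ave() gives each row the maximum of its OWN location,
# so the comparison cannot match a different location's maximum
res_ave <- xveg[xveg$tot == ave(xveg$tot, xveg$loc, FUN = max), c("loc", "sp")]

# Hypothetical variation (not in the thread): make the L1/"a" total
# equal to L2's maximum (30) while L1's maximum stays at 60
xveg2 <- xveg
xveg2$tot[1] <- 30

res_in2  <- xveg2[xveg2$tot %in% with(xveg2, tapply(tot, loc, max)), c("loc", "sp")]
res_ave2 <- xveg2[xveg2$tot == ave(xveg2$tot, xveg2$loc, FUN = max), c("loc", "sp")]

nrow(res_in2)   # 4: the extra L1/"a" row slips through
nrow(res_ave2)  # 3: one row per location
```

On the data as posted the two approaches agree. Both still return every tied row when a location has more than one species at its maximum, so check that this is what you want.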
--
David Winsemius
On Feb 12, 2009, at 5:59 PM, markleeds at verizon.net wrote:
> It does, and you get exactly what Monica wanted if you take out the
> "sp" and just return the whole thing. Thanks.
>
>
>
> On Thu, Feb 12, 2009 at 5:52 PM, David Winsemius wrote:
>
>> aggregate and by are convenience wrappers around tapply.
>>
>> Consider this alternate solution:
>>
>> xveg[which(xveg$tot %in% with(xveg, tapply(tot, loc, max))),"sp"]
>>
>> It uses tapply to find the maxima by loc(ation) and then goes
>> back into xveg to find the corresponding sp(ecies). You should
>> test whether the handling of ties agrees with your needs.
>>
>> --
>> David Winsemius
>>
>> On Feb 12, 2:56 pm, "Christos Hatzis" <christos.hat... at nuverabio.com>
>> wrote:
>>> I don't have an easy solution with aggregate, because the function
>>> in aggregate needs to return a scalar.
>>> But the following should work:
>>>
>>> do.call("rbind", lapply(split(xveg, xveg$loc), function(x)
>>> x[which.max(x$tot), ]))
>>>
>>>    loc sp tot
>>> L1  L1  b  60
>>> L2  L2  e  30
>>> L3  L3  b  68
>>>
>>> -Christos
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: r-help-boun... at r-project.org
>>>> [mailto:r-help-boun... at r-project.org] On Behalf Of Monica Pisica
>>>> Sent: Thursday, February 12, 2009 1:58 PM
>>>> To: R help project
>>>> Subject: [R] Aggregrate function
>>>
>>>> Hi,
>>>
>>>> I have to admit that I don't fully understand the
>>>> aggregate function, but I think it should help me with what I
>>>> want to do.
>>>
>>>> xveg is a data.frame with location, species, and total for
>>>> the species. Each location is repeated, once for every
>>>> species present at that location. For each location I want to
>>>> find out which species has the maximum total, so I've
>>>> tried different ways to do it using aggregate.
>>>
>>>> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
>>>> sp <- c("a", "b", "c", "a", "d", "b", "e", "c", "b", "d")
>>>> tot <- c(20, 60, 40, 15, 25, 10, 30, 20, 68, 32)
>>>> xveg <- data.frame(loc, sp, tot)
>>>
>>>> result desired:
>>>
>>>> L1 b
>>>> L2 e
>>>> L3 b
>>>
>>>> sp_maj <- aggregate(xveg[,2], list(xveg[,1]), function(x)
>>>> levels(x)[which.max(table(x))])
>>>
>>>> This is wrong because it gives the first species name in each
>>>> level of location, so I get a, a, b as species instead of b, e, b.
>>>
>>>> I've tried a few other aggregate commands, all with wrong results.
>>>
>>>> I will appreciate any help,
>>>
>>>> Thanks,
>>>
>>>> Monica
>>>
>>>
>>>> ______________________________________________
>>>> R-h... at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.