[R] Partial aggregate on sorted data

Moisan Yves ymoisan at groupesm.com
Wed Oct 24 16:45:16 CEST 2007


Hi Jim,

Works indeed.  Thanks a lot!  It would be nice if those options were part of the aggregate function, though, just to have an easier way to play with the sorting and subset parameters.  Thanks again!

Yves 

P.S. there's a small typo :DEsCENDING :-)

-----Original Message-----
From: jim holtman [mailto:jholtman at gmail.com] 
Sent: October 24, 2007 09:48
To: Moisan Yves
Cc: r-help at r-project.org
Subject: Re: [R] Partial aggregate on sorted data

Is this something like what you want?

> set.seed(1)
> test <- data.frame(value=runif(100), fact=sample(LETTERS[1:5], 100, TRUE))
> result <- tapply(test$value, test$fact, function(x, sort, subset){
+     x <- x[order(x, decreasing=(sort == "DECENDING"))]
+     mean(head(x, length(x) * subset))
+ }, sort="DECENDING", subset=.33)
> result
        A         B         C         D         E
0.8302502 0.8583468 0.7461504 0.7594074 0.9143997
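
The same idea can probably be used with aggregate() directly, since aggregate() forwards extra arguments to FUN. The following is only a sketch: the helper name top_fraction_mean and its arguments frac and decreasing are made up for illustration and are not part of base R. Unlike the tapply() version above, it keeps at least one value per group, and sort() silently drops NAs.

```r
# Hypothetical helper (not part of base R): mean of the top (or bottom)
# fraction of a numeric vector.
top_fraction_mean <- function(x, frac = 1, decreasing = TRUE) {
  x <- sort(x, decreasing = decreasing)      # note: sort() removes NAs
  n <- max(1, floor(length(x) * frac))       # keep at least one value
  mean(head(x, n))
}

set.seed(1)
test <- data.frame(value = runif(100),
                   fact  = sample(LETTERS[1:5], 100, TRUE))

# Extra named arguments (frac, decreasing) are passed through to FUN.
res <- aggregate(test$value, list(fact = test$fact),
                 top_fraction_mean, frac = 0.33, decreasing = TRUE)
print(res)
```

The argument names were chosen to avoid clashing with arguments that aggregate() methods already define (the formula method has its own subset argument, for instance).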


On 10/24/07, Yves Moisan <ymoisan at groupesm.com> wrote:
>
> Hi All,
>
> I'm looking for ways to compute aggregate statistics (with the aggregate
> function) but with an option for sorting and selecting a subset of the data
> frame.  For example, I would like to turn this:
>
> aggregate(myDataframe$TargetValue,list(SomeFactor =
> myDataframe$SomeFactor),mean)
>
> into something like
>
> aggregate(myDataframe$TargetValue,list(SomeFactor =
> myDataframe$SomeFactor),mean, sort=DESCENDING, subset=0.33)
>
> where sort would sort TargetValue per factor level and subset would be (for
> example) a value between 0 and 1.  The example above would give me the mean
> for the top third of TargetValue per factor.
>
> Any way of doing this without having to use temporary variables to stuff my
> vectors, use length(), etc ?


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


