[R] function by: order within subsets

Henrique Dallazuanna wwwhsd at gmail.com
Fri Jan 8 20:12:15 CET 2010


Try this:

transform(df, v = unlist(with(df, tapply(v, f, cumsum))))
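
One caveat with the line above: unlist(tapply(...)) returns the values grouped by the levels of f, so it lines up with the rows of df only when df is already sorted by f, as it is in your example. A minimal sketch of an alternative, assuming the same column names f and v, uses ave(), which returns each result in the position of the row it came from. Here df2 is a hypothetical copy with the groups interleaved, used only to illustrate the point:

```r
df <- data.frame(f = c("A", "A", "B"),
                 d = as.Date(c("2010-01-01", "2010-02-01", "2010-01-01")),
                 v = c(100, 200, 150))

# ave() applies cumsum within each level of f and puts each result
# back in the position of the row it came from
res <- transform(df, v = ave(v, f, FUN = cumsum))

# Hypothetical copy with the groups interleaved (rows in the order A, B, A)
df2 <- df[c(1, 3, 2), ]
res2 <- transform(df2, v = ave(v, f, FUN = cumsum))
```

With ave() the data frame does not have to be grouped by f first, although the cumulative sums still follow whatever order the rows have within each group.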

On Fri, Jan 8, 2010 at 4:10 PM, Daniel Murphy <chiefmurphy at gmail.com> wrote:
> When the 'by' function forms subsets, are the rows in the same order as they
> are in the original data frame?
>
> For example, I want to use 'by' to calculate cumulative sums of a value 'v'
> by date 'd' for different levels of a factor 'f':
>
>> df<-data.frame(f=c("A","A","B"),d=as.Date(c("2010-1-1","2010-2-1","2010-1-1")),v=c(100,200,150))
>> df
>   f          d   v
> 1 A 2010-01-01 100
> 2 A 2010-02-01 200
> 3 B 2010-01-01 150
>> do.call(rbind,by(df,df$f,FUN=function(x) data.frame(x[1],x[2],cumsum(x[3]))))
>     f          d   v
> A.1 A 2010-01-01 100
> A.2 A 2010-02-01 300
> B   B 2010-01-01 150
>
> This is exactly what I want, namely, cumulative sums by date.
>
> Can I be sure that the rows within subset A will be arranged in date order
> as they are in the original data frame? I would not want 'by' to randomly
> switch the order and create, for example,
>     f          d   v
> A.1 A 2010-02-01 200
> A.2 A 2010-01-01 300
> B   B 2010-01-01 150
>
> I could force the order of each subset within the FUN of by, adding to the
> execution time. Would that be advised?
>
> Thanks,
>
> Dan
>
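
On the ordering question itself: as far as I know, by() on a data frame splits the row indices with tapply(), so the rows of each subset stay in the same relative order as in the original data frame and are not shuffled. If the cumulative sums must follow date order however the input happens to be arranged, a sketch under that assumption is to sort once before grouping rather than inside FUN (df_sorted is a hypothetical name, and the input rows are deliberately out of date order):

```r
df <- data.frame(f = c("A", "B", "A"),
                 d = as.Date(c("2010-02-01", "2010-01-01", "2010-01-01")),
                 v = c(200, 150, 100))

# Sort once by factor and then by date, so no sorting is needed inside FUN
df_sorted <- df[order(df$f, df$d), ]

# Cumulative sum of v within each level of f, now in date order
df_sorted$v <- ave(df_sorted$v, df_sorted$f, FUN = cumsum)
```

A single order() call up front is typically cheaper than sorting each subset separately inside by().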



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


