[R] accumulative grouping of time series

Dennis Murphy djmuser at gmail.com
Mon Aug 15 20:46:16 CEST 2011


Hi:

Thank you for the reproducible example and expected result. Here's one approach:

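library('plyr')
x <- data.frame(a=1:10, b=11:20, t=c(1,1,1,2,2,2,3,3,3,3))

# Running total of a - b over all rows (assumes the rows are sorted by t)
x$sdif <- cumsum(with(x, a - b))
# Keep the last row of each t group: the total over all rows with t <= t0
subfun <- function(d) tail(d[c('t', 'sdif')], 1)
ddply(x, 't', subfun)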

  t sdif
1 1  -30
2 2  -60
3 3 -100
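
The cumsum() trick works here because f is a plain sum. For a function that
cannot be accumulated that way, a minimal base R sketch is to evaluate f on
the subset t <= t0 for each distinct t0. The helper name cumgroup and its
arguments are made up for illustration, and it assumes f takes a data frame
and returns a single number:

```r
x <- data.frame(a = 1:10, b = 11:20, t = c(1,1,1,2,2,2,3,3,3,3))

# Hypothetical helper: apply f to all rows with t <= t0, for each distinct t0.
# Assumes f takes a data frame and returns one numeric value.
cumgroup <- function(d, f, by = "t") {
  ts <- sort(unique(d[[by]]))
  vals <- vapply(
    ts,
    function(t0) as.numeric(f(d[d[[by]] <= t0, , drop = FALSE])),
    numeric(1)
  )
  data.frame(t = ts, f = vals)
}

res <- cumgroup(x, function(d) sum(d$a - d$b))
print(res)
```

This still re-subsets the data once per distinct t, so it does the same
amount of work as the loop in the original post; it only avoids growing the
result with rbind().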

HTH,
Dennis

2011/8/15 Ernest Adrogué <nfdisco at gmail.com>:
> Hi there,
>
> Consider a data set like this:
>
>> x <- data.frame(a=1:10, b=11:20, t=c(1,1,1,2,2,2,3,3,3,3))
>> x
>    a  b t
> 1   1 11 1
> 2   2 12 1
> 3   3 13 1
> 4   4 14 2
> 5   5 15 2
> 6   6 16 2
> 7   7 17 3
> 8   8 18 3
> 9   9 19 3
> 10 10 20 3
>
> Here x$t is a vector of integers that represent a moment
> in time. I would like to calculate a function of a & b at
> each moment (t0), but using the rows corresponding not only
> to moment t0 but also all moments t < t0.
>
> For example, if the function was f(a,b) = sum(a - b), the
> result would be
>
> t    f
> 1  -30           # (1-11) + (2-12) + (3-13)
> 2  -60
> 3 -100
>
> As far as I know there is no built-in function in R to
> group rows like this. The naive approach of using a loop is
> doable but extremely slow even for small data sets.
>
> result <- NULL
> for (i in unique(x$t)) {
>  part <- x[x$t <= i,]
>  result <- rbind(result, sum(part$a - part$b))
> }
>
> So, any suggestions?
>
> Note: in this example, it is possible to calculate f() for
> each subset using by() and then accumulate the results, but
> with other functions this won't work.
>
> Cheers,
> Ernest


