[R] subsetting, aggregating and zoo
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Oct 29 19:23:40 CET 2006
Sorry, the line starting idx <- should have time(z) in place of z.
That is,
library(zoo)
year <- as.Date(c(
"1988-01-13", "1988-01-14", "1988-01-16", "1988-01-20", "1988-01-21",
"1988-01-22", "1988-01-25", "1988-01-26", "1988-01-27", "1988-01-28"))
x <- c(
7.973946, 9.933518, 7.978227, 7.512960, 6.641862, 5.667780, 5.721358,
6.863729, 9.600000, 9.049846)
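z <- zoo(x, year)
# idx numbers the runs of consecutive dates: 1 for the first run, 2 for the second, etc.
idx <- cumsum(c(1, diff(time(z)) != 1))
# start and end date of the run that each observation belongs to
starts <- time(z)[match(idx, idx)]
ends <- time(z)[cumsum(table(idx))[idx]]
# mean over each run, labelled by the start date of the run
aggregate(z, starts, mean)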
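To see why the original line returned the series unchanged, here is a small sketch using the first five observations above. The names dates, idx_values and idx_dates are only illustrative. Differencing the values almost never gives exactly 1, so every observation becomes its own run and each "mean" is just the observation itself; differencing the index starts a new run only where the gap is not one day.

```r
library(zoo)

# first five observations of the series above
dates <- as.Date(c("1988-01-13", "1988-01-14", "1988-01-16",
                   "1988-01-20", "1988-01-21"))
z <- zoo(c(7.973946, 9.933518, 7.978227, 7.512960, 6.641862), dates)

# differencing the data values: no difference equals 1,
# so every point is flagged as the start of a new run
idx_values <- cumsum(c(1, diff(coredata(z)) != 1))

# differencing the dates: a new run starts only after a gap
idx_dates <- cumsum(c(1, diff(time(z)) != 1))

idx_values  # 1 2 3 4 5
idx_dates   # 1 1 2 3 3
```

In the earlier test data the values happened to equal the day numbers of the index, which is why differencing z appeared to work there.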
By the way, dput(v, control = "all") will output variable v
in a form easily pastable by someone else into their session.
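For instance, a minimal sketch with a made-up two-point series (the values and the name z2 are only for illustration): the sender runs dput and pastes the printed text into the message, and the recipient assigns that text to a variable. Here deparse plus eval stands in for the copy and paste step.

```r
library(zoo)

# a small made-up series
z <- zoo(c(1.5, 2.5), as.Date(c("2006-10-28", "2006-10-29")))

# prints R code that recreates z, attributes included
dput(z, control = "all")

# simulate pasting that text into another session
txt <- paste(deparse(z, control = "all"), collapse = "\n")
z2 <- eval(parse(text = txt))
```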
On 10/29/06, antonio rodriguez <antonio.raju at gmail.com> wrote:
> Gabor Grothendieck wrote:
> > Try this:
> >
> > # test data
> > x <- c(1:4, 6:8, 10:14)
> > z <- zoo(x, as.Date(x))
> >
> > # idx is 1 for first run, 2 for second run, etc.
> > idx <- cumsum(c(1, diff(z) != 1))
> >
> > # starts replaces each time with the start time of that run
> > # ends is similar but for ends
> > starts <- time(z)[match(idx, idx)]
> > ends <- time(z)[cumsum(table(idx))[idx]]
> >
> > # average over each run using the time of the end of run for the result
> > # replace ends with starts if that is preferred
> > aggregate(z, ends, mean)
> >
> Yes it's OK in your example, but when I try to do it with my data I
> don't get the same figure.
>
> is.zoo(z)
> [1] TRUE
>
> attributes(z)
> $index
> [1] "1988-01-13" "1988-01-14" "1988-01-16" "1988-01-20" "1988-01-21"
> ..................................................................................................
> [3861] "2005-12-20" "2005-12-23" "2005-12-24" "2005-12-25" "2005-12-26"
> [3866] "2005-12-27" "2005-12-30"
>
> $class
> [1] "zoo"
>
> z[1:10]
>
> 1988-01-13 1988-01-14 1988-01-16 1988-01-20 1988-01-21 1988-01-22 1988-01-25
> 7.973946 9.933518 7.978227 7.512960 6.641862 5.667780 5.721358
> 1988-01-26 1988-01-27 1988-01-28
> 6.863729 9.600000 9.049846
>
> If I follow your instructions,
>
> idx <- cumsum(c(1, diff(z) != 1))
> starts <- time(z)[match(idx, idx)]
> ends <- time(z)[cumsum(table(idx))[idx]]
>
> s1 <- aggregate(z, starts, mean)
> s1[1:10]
>
> 1988-01-13 1988-01-14 1988-01-16 1988-01-20 1988-01-21 1988-01-22 1988-01-25
> 7.973946 9.933518 7.978227 7.512960 6.641862 5.667780 5.721358
> 1988-01-26 1988-01-27 1988-01-28
> 6.863729 9.600000 9.049846
>
> s2 <- aggregate(z, starts, mean)
> s2[1:10]
>
> 1988-01-13 1988-01-14 1988-01-16 1988-01-20 1988-01-21 1988-01-22 1988-01-25
> 7.973946 9.933518 7.978227 7.512960 6.641862 5.667780 5.721358
> 1988-01-26 1988-01-27 1988-01-28
> 6.863729 9.600000 9.049846
>
>
> Always the same. I don't know why (there are no NAs in the series).
>
> Antonio