[R] block statistics with POSIX classes
Gabor Grothendieck
ggrothendieck at myway.com
Thu Sep 23 19:03:32 CEST 2004
Kahra Hannu <kahra <at> mpsgr.it> writes:
:
: I have followed Gabor's instructions:
:
: > aggregate(list(y=y), list(dp$year), mean)$y # returns NULL since y is a time series
: NULL
:
: > aggregate(list(y=as.vector(y)), list(dp$year), mean)$y # returns annual means
: [1] 0.0077656696 0.0224050294 0.0099991898 0.0240550925 -0.0084085867
: [6] -0.0170950194 -0.0355641251 0.0065873997 0.0008253111
:
: > aggregate(list(y=y), list(dp$year), mean) # returns the same as the previous one
: Group.1 Series.1
: 1 96 0.0077656696
: 2 97 0.0224050294
: 3 98 0.0099991898
: 4 99 0.0240550925
: 5 100 -0.0084085867
: 6 101 -0.0170950194
: 7 102 -0.0355641251
: 8 103 0.0065873997
: 9 104 0.0008253111
:
: Gabor's second suggestion returns different results:
:
: > aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1, mean)
: Time Series:
: Start = 96.33333
: End = 103.3333
: Frequency = 1
: Series 1
: [1,] 0.016120895
: [2,] 0.024257131
: [3,] 0.007526997
: [4,] 0.017466118
: [5,] -0.016024846
: [6,] -0.017145159
: [7,] -0.036047765
: [8,] 0.014198501
:
: > aggregate(y, 1, mean) # verifies the result above
: Time Series:
: Start = 1996.333
: End = 2003.333
: Frequency = 1
: Series 1
: [1,] 0.016120895
: [2,] 0.024257131
: [3,] 0.007526997
: [4,] 0.017466118
: [5,] -0.016024846
: [6,] -0.017145159
: [7,] -0.036047765
: [8,] 0.014198501
:
: The data is from 1996:5 to 2004:8. The difference of the results must depend
: on the fact that the beginning of the data is not January and the end is not
: December? The first two solutions give nine annual means while the last two
: give only eight means. The block size in the last two must be 12 months, as
: is said in ?aggregate, instead of a calendar year that I am looking for.
: Gabor's first suggestion solved my problem.
Yes, that seems to be the case. Using length instead of mean, we find
that the aggregate.data.frame example used calendar years as the basis
of aggregation, whereas the aggregate.ts example used successive
12-month periods starting from the first month, discarding the 4 points
at the end that do not fill out a full year.
R> set.seed(1)
R> dp <- as.POSIXlt(seq(from=as.Date("1996-5-1"), to=as.Date("2004-8-1"),
+ by="month"))
R> y <- rnorm(length(dp$year))
R> aggregate(list(y=y), list(dp$year), length)$y
[1] 8 12 12 12 12 12 12 12 8
R> aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1,
+ length)
Time Series:
Start = 96.33333
End = 103.3333
Frequency = 1
[1] 12 12 12 12 12 12 12 12
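
The two groupings can also be reproduced by hand, which may make it easier to see where the nine versus eight groups come from. This is only a sketch in base R: the names dates, cal_year, block, n_full and n_dropped are illustrative and do not appear in the thread, and the integer-division blocking is an assumption about how the 12-month periods line up, not a copy of what aggregate.ts does internally.

```r
# Monthly dates matching the example above: 1996-05 through 2004-08.
dates <- seq(from = as.Date("1996-05-01"), to = as.Date("2004-08-01"),
             by = "month")
n <- length(dates)

# Calendar-year grouping (the grouping obtained from dp$year).
cal_year <- as.integer(format(dates, "%Y"))
cal_counts <- table(cal_year)

# Successive 12-observation blocks counted from the first observation
# (assumed to mirror the periods used with nfreq = 1).
block <- (seq_len(n) - 1) %/% 12
n_full <- n %/% 12                        # number of complete blocks
block_counts <- table(block[block < n_full])
n_dropped <- n - 12 * n_full              # trailing observations left out

print(cal_counts)
print(block_counts)
print(n_dropped)
```

With 100 monthly observations this gives nine calendar-year groups (8 months in 1996, 12 in each of 1997 to 2003, 8 in 2004), eight complete 12-month blocks, and 4 trailing observations left over, in line with the counts shown above.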