# [R] block statistics with POSIX classes

Kahra Hannu kahra at mpsgr.it
Thu Sep 23 17:42:59 CEST 2004

I have followed Gabor's instructions:

> aggregate(list(y=y), list(dp$year), mean)$y  # returns NULL since y is a time series
NULL

> aggregate(list(y=as.vector(y)), list(dp$year), mean)$y  # returns annual means
[1]  0.0077656696  0.0224050294  0.0099991898  0.0240550925 -0.0084085867
[6] -0.0170950194 -0.0355641251  0.0065873997  0.0008253111

> aggregate(list(y=y), list(dp$year), mean)  # returns the same means as the previous call
  Group.1      Series.1
1      96  0.0077656696
2      97  0.0224050294
3      98  0.0099991898
4      99  0.0240550925
5     100 -0.0084085867
6     101 -0.0170950194
7     102 -0.0355641251
8     103  0.0065873997
9     104  0.0008253111

Gabor's second suggestion returns different results:

> aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1, mean)
Time Series:
Start = 96.33333
End = 103.3333
Frequency = 1
         Series 1
[1,]  0.016120895
[2,]  0.024257131
[3,]  0.007526997
[4,]  0.017466118
[5,] -0.016024846
[6,] -0.017145159
[7,] -0.036047765
[8,]  0.014198501

> aggregate(y, 1, mean)  # verifies the result above
Time Series:
Start = 1996.333
End = 2003.333
Frequency = 1
         Series 1
[1,]  0.016120895
[2,]  0.024257131
[3,]  0.007526997
[4,]  0.017466118
[5,] -0.016024846
[6,] -0.017145159
[7,] -0.036047765
[8,]  0.014198501

The data run from 1996:5 to 2004:8. The difference in the results must come from the fact that the series neither begins in January nor ends in December. The first two solutions give nine annual means, while the last two give only eight. The block in the last two must be 12 successive months counted from the start of the series, as ?aggregate says, rather than the calendar year that I am looking for. Gabor's first suggestion solved my problem.
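
The difference can be reproduced with a small sketch. The data below are made up (100 random monthly values covering 1996:5 to 2004:8), and the names dates, by_year, y_ts and by_block are only illustrative; the point is the number of groups each call produces, not the values themselves.

```r
# Hypothetical data: 100 monthly observations, 1996-05 through 2004-08
dates <- seq(as.Date("1996-05-01"), by = "month", length.out = 100)
dp <- as.POSIXlt(dates)
set.seed(1)
y <- rnorm(100)

# Grouping by calendar year: one row per distinct year (1996, ..., 2004)
by_year <- aggregate(list(y = y), list(year = dp$year + 1900), mean)

# aggregate.ts: successive 12-month blocks counted from the first observation;
# the incomplete trailing block (the last 4 months here) is dropped
y_ts <- ts(y, start = c(1996, 5), frequency = 12)
by_block <- aggregate(y_ts, nfrequency = 1, FUN = mean)

nrow(by_year)     # number of calendar years
length(by_block)  # number of complete 12-month blocks
```

With this layout the first calendar-year mean covers only May to December 1996 (8 values), whereas the first block mean covers May 1996 to April 1997 (12 values).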

Thank you,
Hannu

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Gabor Grothendieck
Sent: Thursday, September 23, 2004 3:52 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] block statistics with POSIX classes

I am not sure that I understand what you are looking
for since you did not provide any sample data with
corresponding output to clarify your question.

Here is another guess.

If y is just a numeric vector with monthly data
and dp is a POSIXlt vector of the same length then:

aggregate(list(y=y), list(dp$year), mean)$y

will perform aggregation, as will

aggregate(ts(y, start=c(dp$year[1],dp$mon[1]+1), freq = 12), nfreq=1, mean)

which converts y to ts and then performs the aggregation.  The first
one will work even if y is irregular, while the second one assumes that
y is regular monthly data.  The second one returns a ts object.
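
One detail worth keeping in mind when building the start argument from POSIXlt fields: the year component counts years since 1900 and the mon component runs from 0 to 11. A minimal sketch with made-up dates (dp, y, start_raw, start_cal and y_ts are illustrative names only):

```r
# Hypothetical monthly dates starting in May 1996
dp <- as.POSIXlt(seq(as.Date("1996-05-01"), by = "month", length.out = 24))
y <- seq_along(dp)

# Raw POSIXlt fields: years since 1900 and zero-based months
start_raw <- c(dp$year[1], dp$mon[1] + 1)          # year 96, month 5

# Adding 1900 gives a calendar-year time axis for the ts object
start_cal <- c(dp$year[1] + 1900, dp$mon[1] + 1)   # year 1996, month 5
y_ts <- ts(y, start = start_cal, frequency = 12)
start(y_ts)
```

The aggregated values are the same either way; only the labels on the time axis change.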

By the way, I had a look at gev's source and it seems that despite the
documentation it does not use POSIXct anywhere internally.  If the
block is "year" or other character value then it simply assumes that
whatever datetime class is used has an as.POSIXlt method.  If your dates
were POSIXct rather than POSIXlt then it would be important to ensure
that whatever timezone is assumed (which I did not check) in the conversion
is the one you are using.  You could use character dates or Date class to
avoid this problem.  Since you appear to be using POSIXlt datetimes from
the beginning I think you should be ok.
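
To illustrate the timezone point, here is a small sketch with a made-up timestamp. It assumes the "Asia/Tokyo" zone is available in the system's timezone database; the names t_utc, year_utc, year_tokyo and year_date are illustrative only.

```r
# A POSIXct instant half an hour before midnight UTC on New Year's Eve
t_utc <- as.POSIXct("1999-12-31 23:30:00", tz = "UTC")

# The same instant falls in different calendar years depending on the
# timezone used when it is converted to POSIXlt
year_utc   <- as.POSIXlt(t_utc, tz = "UTC")$year + 1900
year_tokyo <- as.POSIXlt(t_utc, tz = "Asia/Tokyo")$year + 1900

# A Date carries no time of day, so the year is unambiguous
year_date <- as.POSIXlt(as.Date("1999-12-31"))$year + 1900

c(year_utc, year_tokyo, year_date)
```

An observation near a year boundary could therefore land in a different annual block depending on the assumed timezone, which is why Date or character dates are the safer choice here.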

Kahra Hannu <kahra <at> mpsgr.it> writes:

:
: Thank you Petr and Gabor for the answers.
:
: They did not, however, solve my original problem. When I have a monthly time
: series y with a POSIX date variable dp, the most obvious way to compute e.g.
: the annual means is to use aggregate(y, 1, mean) that works with time series.
: However, I got stuck with the idea of using the 'by' argument as
: by = dp$year. But in that case y has to be a data.frame. The easiest way must
: be the best way.
:
: Regards,
: Hannu
:
: -----Original Message-----
: From: r-help-bounces <at> stat.math.ethz.ch
: [mailto:r-help-bounces <at> stat.math.ethz.ch]On Behalf Of Gabor Grothendieck
: Sent: Thursday, September 23, 2004 12:56 PM
: To: r-help <at> stat.math.ethz.ch
: Subject: Re: [R] block statistics with POSIX classes
:
:
: Kahra Hannu <kahra <at> mpsgr.it> writes:
:
: :
: : I have a monthly price index series x, the related return series
: : y = diff(log(x)) and a POSIXlt date-time variable dp. I would like to apply
: : annual blocks to compute for example annual block maxima and mean of y.
: :
: : When studying the POSIX classes, in the first stage of the learning curve,
: : I computed the maximum drawdown of x:
: : > mdd <- maxdrawdown(x)
: : > max.dd <- mdd$maxdrawdown
: : > from <- as.character(dp[mdd$from])
: : > to <- as.character(dp[mdd$to])
: : > from; to
: : [1] "2000-08-31"
: : [1] "2003-03-31"
: : that gives me the POSIX dates of the start and end of the period and
: : suggests that I have done something correctly.
: :
: : Two questions:
: : (1) how to implement annual blocks and compute e.g. annual max, min and
: : mean of y (each year's max, min, mean)?
: : (2) how to apply POSIX variables with the 'block' argument in gev in the
: : evir package?
: :
: : The S+FinMetrics function aggregateSeries does the job in that module; but
: : I do not know how to handle it in R.
: : My guess is that (1) is done by using the function aggregate, but how to
: : define the 'by' argument with POSIX variables?
:
: 1. To create a ts monthly time series you specify the first month
: and a frequency of 12 like this.
:
: z.m <- ts(rep(1:6,4), start = c(2000,1), freq = 12)
: z.m
:
: # Annual aggregate is done using aggregate.ts with nfreq = 1
: z.y <- aggregate(z.m, nfreq = 1, max)
: z.y
:
: # To create a POSIXct series of times using seq
: # (This will use GMT.  Use tz="" arg to ISOdate if you want current tz.)
: z.y.times <- seq(ISOdate(2000,1,1), length = length(z.y), by = "year")
: z.y.times
:
: 2. Have not used evir but looking at ?gev it seems you can
: use block = 12 if you have monthly data and want the blocks to be
: successive 12 month periods or you can add a POSIXct times attribute to
: your data as below (also see comment re tz above) and then use
: block = "year" in your gev call.
:
: attr(z.m, "times") <- seq(ISOdate(2000,1,1), length=length(z.m), by="month")
: str(z.m)  # display z.m along with attribute info
:
: ______________________________________________
: R-help <at> stat.math.ethz.ch mailing list
: https://stat.ethz.ch/mailman/listinfo/r-help
