ts and defaults

Martyn Plummer plummer@iarc.fr
Mon, 09 Aug 1999 10:02:11 +0200 (CEST)


I think the c(base, offset) notation is appropriate when
1) the frequency is an integer, and
2) the time dimension refers to an interval,
whereas the scalar representation of time is appropriate when the time
dimension refers to a precise point in time.  There is no unambiguous
way of translating between the two, since they do not measure the same
thing, and it is a bit unfortunate that the two are mixed up.

The c(base,offset) notation is impossible to interpret without
reference to the frequency. So whereas
[1] 2000    1
refers to the whole of 2000 if frequency=1, it means the first 1/12
of 2000 (i.e. January) if frequency=12. This is another source of confusion.
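
A minimal sketch of this dependence on frequency, assuming the conversion
base + (offset - 1)/frequency that the help text quoted further down implies
(February 1970 as c(1970,2) or 1970+(1/12)). The helper name pair_to_time is
hypothetical and not part of R; the last two lines assume a current version
of R, where ts() applies the same conversion to a two-element start.

```r
# Hypothetical helper (not part of R): convert a c(base, offset) pair
# to a scalar time. The result depends on the frequency.
pair_to_time <- function(pair, frequency) {
  pair[1] + (pair[2] - 1) / frequency
}

# The same pair denotes different times under different frequencies.
feb <- pair_to_time(c(2000, 2), frequency = 12)  # start of February 2000
q2  <- pair_to_time(c(2000, 2), frequency = 4)   # start of second quarter

# ts() stores the scalar start time as the first element of tsp().
x <- ts(1:24, start = c(2000, 2), frequency = 12)
ts_start <- tsp(x)[1]
```

Here feb and ts_start should agree, while q2 differs, although all three
were specified with the same pair c(2000, 2).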

I note in passing that the R base library allows the c(base,offset) notation
for non-integer frequency, e.g.

R> x <- ts(1:10,start=c(2,1), freq=2.5)
R> start(x)
[1] 2 1

which is disallowed by S-PLUS.

Martyn


On 06-Aug-99 Prof Brian D Ripley wrote:
> On Fri, 6 Aug 1999, Paul Gilbert wrote:
> 
>> Brian
>> 
>> The example below may be helpful to further illustrate the concern I have
>> with the effects of your ts library on defaults. According to the "Blue
>> Book", in Splus, in R up to 0.64.2, and in today's snapshot of 0.65 without
>> the "ts" library attached
>> 
>> > end(ts(rnorm(10), start=c(1991,1), frequency=1))
>> 
>> gives
>> [1] 2000    1
> 
> How is anyone supposed to know that means 2000 not 2001 without reading
> the help page?
> 
>> However, with the "ts" library attached the result is
>> 
>> [1] 2000
> 
> Isn't that much more readable?
> 
>> This change in the default behavior will break a lot of user code [ ... ]
> 
> Statistical evidence, please.
> 
> Note that all the S (and R) code is written to accept start=1991, so this
> user code must depart from that convention. Maybe it would be a good
> opportunity to re-write that code to conform to the standard conventions?
> As in
> 
> start:    starting date for the series, e.g., in years, February,
>        1970  would  be  1970+(1/12)  or  1970.083.  If start is a
>        vector with  at  least  two  data  values,  the  first  is
>        interpreted  as  the  time  unit,  e.g., the year, and the
>        second as  the  number  of  positions  into  the  sampling
>        period;  e.g., February, 1970 could be c(1970,2).
> 
> Note the two possibilities, and that the one you advocate is the
> alternate. I at least find c(1970, 2) = 1970+(1/12) a _very_ confusing
> notation, fortunately one that can be avoided.
> 
> The problem here is that ts in R is fundamentally incompatible with S, as R
> has a ts class and S does not. I believe start.default should not be 
> implemented in terms of class ts (and any user can change the
> behaviour of a class), and I trust the `lots of user code' does not do
> things like that. The plan is to separate the default methods from the
> ts class methods.  We do not consider that S compatibility applies 
> to class ts, which S does not have.
> 
> Brian
> 