[R-SIG-Finance] timeseries - xts vs. dataframe?

Gabor Grothendieck ggrothendieck at gmail.com
Thu Feb 14 12:39:30 CET 2008


See interspersed comments.

On Thu, Feb 14, 2008 at 2:00 AM, icosa atropa <icos.atropa at gmail.com> wrote:
> It looks like help('$') and help('is.atomic') raise some interesting
> ts and zoo implementation questions. I understand that zoo is modeled
> on ts; I didn't really appreciate until now that ts was atomic. Is an
> atomic ts and zoo "correct"? Would a recursive zoo break things? Items
> of note:
>
> * Is zoo conceptually atomic if it contains a non-integer index vector
> and an integer data vector?

zoo does not know what the mode or class of its index vector is.  It only
deals with the index vector through the methods of the index vector's class
that are required to be there, as defined in ?zoo.  The only exceptions
are a few routines which deal with the outside world:

- read.zoo knows about certain classes so it can read them in from files
- yearmon and yearqtr are provided classes intended to parallel
  frequency = 12 and frequency = 4, respectively, in ts
- as.Date is extended slightly to handle numeric arguments without having
  to specify the origin
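
To illustrate the point about the index, here is a minimal sketch, assuming
the zoo package is installed. The variable names (vals, z_date, etc.) are
made up for this example. The same data is ordered by four different index
classes, and zoo simply carries whatever class it is given:

```r
library(zoo)

vals <- c(10, 20, 30)

# The same data ordered by four different index classes
z_date <- zoo(vals, order.by = as.Date("2008-02-12") + 0:2)
z_mon  <- zoo(vals, order.by = as.yearmon(2008 + 0:2 / 12))  # monthly, cf. frequency = 12 in ts
z_qtr  <- zoo(vals, order.by = as.yearqtr(2008 + 0:2 / 4))   # quarterly, cf. frequency = 4 in ts
z_chr  <- zoo(vals, order.by = c("a", "b", "c"))

# zoo stores the index as supplied; its class is that of the input
class(index(z_date))  # "Date"
class(index(z_mon))   # "yearmon"
class(index(z_qtr))   # "yearqtr"
class(index(z_chr))   # "character"
```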

>
> * zoo and ts behave more recursively than atomically under certain conditions:
> One example is dynlm. It accepts a number of recursive objects for
> "data=", including environments and dataframes. Here zoo and ts as
> atomic seem to be temporarily granted recursive, environment-like
> status so that the model can be specified using the traditional
> environment-element syntax:
>
> # one of these is not like the others
> test.matrix = cbind(a=1:5, b=2*1:5)
> test.ts = ts(cbind(a=1:5, b=2*1:5))
> test.zoo = zoo(data.frame(a=1:5, b=2*1:5), order.by=letters[1:5])
> test.df = data.frame(a=1:5, b=2*1:5, group=c('A', 'A', 'A', 'B', 'B'),
>  order.by=letters[1:5])
>
> dynlm(a ~ b, data=test.matrix)
> dynlm(a ~ b, data=test.ts)
> dynlm(a ~ b, data=test.zoo)
> dynlm(a ~ b, data=test.df)
> # the first one doesn't work, but all the others are identical. same for lm()
>

I just checked, and dynlm does this:

    if (!is.list(data))
        data <- as.list(data)

and as.list on a matrix does not give anything useful in this context.

Note that as.list.zoo returns a list whose components are the columns
as zoo objects whereas as.list on a matrix returns a list whose components
are the elements of the matrix.
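
The difference can be seen in a short sketch, assuming the zoo package is
installed. It reuses the test.matrix and test.zoo names from the quoted
example; m_list and z_list are names made up here:

```r
library(zoo)

test.matrix <- cbind(a = 1:5, b = 2 * 1:5)
test.zoo <- zoo(test.matrix, order.by = letters[1:5])

# as.list on a plain matrix: one component per cell, so a formula
# such as a ~ b finds no variables named a or b in it
m_list <- as.list(test.matrix)
length(m_list)   # 10

# as.list on a zoo object: one component per column, each a zoo series
z_list <- as.list(test.zoo)
length(z_list)   # 2
names(z_list)    # "a" "b"
```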

> * One quirk, in my experience, of zoo being atomic is that different
> modes of data for the same observation time must be kept in different
> objects due to type coercion. I.e.

As mentioned above zoo does not know the mode of its index so the
mode of the data is not related to the mode of the index.  What
you are observing is due to whatever coercions exist in R and the methods
being used (which are, in general, not supplied by zoo).  zoo does
not participate.
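
A rough sketch of where the coercion comes from, assuming the zoo package
is installed for the second half. The first half is plain base R. Keeping
the grouping variable in an ordinary vector next to the zoo object is just
one possible workaround, not the only one:

```r
library(zoo)

test.df <- data.frame(a = 1:5, b = 2 * 1:5,
                      group = c("A", "A", "A", "B", "B"))

# Base R coercion, no zoo involved: a data frame with a character
# (or factor) column becomes a character matrix
mode(as.matrix(test.df))                 # "character"
mode(as.matrix(test.df[, c("a", "b")]))  # "numeric"

# One workaround: keep the numeric columns in zoo and the grouping
# variable in an ordinary vector aligned with the index
test.zoo <- zoo(as.matrix(test.df[, c("a", "b")]), order.by = letters[1:5])
group_a <- test.zoo[test.df$group == "A", ]
group_a
```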

>
> # with a dataframe, everything can pack together in one object
> subset(test.df, group=='A')
> # with zoo, separate objects with the same index are necessary
> groups.zoo = zoo(c('A', 'A', 'A', 'B', 'B'), order.by=letters[1:5])
> subset(test.zoo, groups.zoo=='A')
> # this doesn't work, since packing together coerces everything to character
> oops.zoo = zoo(data.frame(a=1:5, b=2*1:5, group=c('A', 'A', 'A', 'B', 'B')),
> order.by=letters[1:5])
>
> * help('ts') and help('vector') reference Becker et al. (1988), while
> help('data.frame') and help('lm') reference Chambers (1992). Is ts
> being atomic perhaps an accident of timing and history? Re-reading the
> docs for lm, data.frame, and is.atomic gives me a sense that custom
> classes tend to model recursive rather than atomic:
>
> help('is.atomic'):
>  'is.atomic' is true for the atomic vector types ('"logical"',
>  '"integer"', '"numeric"', '"complex"', '"character"' and '"raw"')
>  and 'NULL'.
>

Perhaps it's because many computations on arrays are MUCH
faster in R than the same computations on lists or data frames.
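
A minimal sketch for checking this on your own machine, using base R only.
The object names m and d are made up here, and timings depend on the
machine and R version, so no numbers are claimed:

```r
m <- matrix(rnorm(1e6), ncol = 10)  # atomic: one numeric vector plus a dim attribute
d <- as.data.frame(m)               # recursive: a list of 10 column vectors

is.atomic(m)        # TRUE
is.recursive(m)     # FALSE
is.atomic(d)        # FALSE
is.recursive(d)     # TRUE
is.atomic(ts(1:5))  # TRUE

# Same arithmetic on both representations; compare the elapsed times
system.time(for (i in 1:50) m * 2)
system.time(for (i in 1:50) d * 2)
```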

>  Most types of language objects are regarded as recursive: those
>  which are not are the atomic vector types, 'NULL' and symbols (as
>  given by 'as.name').
>
> best,
> christian
>
>
> On Feb 13, 2008 7:48 AM, Jeff Ryan <jeff.a.ryan at gmail.com> wrote:
> > Hi everyone,
> >
> > I agree that the '$' operator seems like a nice addition. The only
> > problem that I see is:
> >
> > from help('$')
> > ...
> > The default methods work somewhat differently for atomic vectors,
> > matrices/arrays and for recursive (list-like, see 'is.recursive')
> > objects. '$' returns 'NULL' (with a warning) except for recursive
> > objects, and is only discussed in the section below on recursive
> > objects. Its use on non-recursive objects was deprecated in R
> > 2.5.0.
> > ...
> >
> > Since zoo (and thus xts) is really a matrix/array with attributes - it
> > seems like it has the chance of breaking something - though where I
> > can't reasonably imagine.
> >
> > The flip side to the argument against is that returning a NULL object
> > seems to be of little value to anything.
> >
> > Is anyone aware of the reason for it being deprecated for non-recursive objects?
> >
> > Jeff
> >
> >
> > On Feb 13, 2008 8:43 AM, Achim Zeileis <Achim.Zeileis at wu-wien.ac.at> wrote:
> > > On Wed, 13 Feb 2008, Gabor Grothendieck wrote:
> > >
> > > > zoo is modelled on the "ts" class, not on the "data.frame" class.
> > > > In R, the way it works is that $ is used on list-based objects
> > > > and not on array-based objects.
> > >
> > > True, and this is the explanation why we don't have it at the moment. But
> > > the $ operator might be a convenient addition. And we currently have a few
> > > examples where we are consistent with "ts" but provide further features
> > > for convenience. At the moment, I don't think something dangerous would
> > > happen if we add it - or do I overlook something?
> > >
> > > Given that you have already written the code, I would vote for including
> > > it in the package.
> > > Z
> > >
> > >
> > > > > Of course, you are free to define
> > > > > and redefine operators as you please, and since zoo is an S3 class
> > > > > it's possible to add your own S3 methods. $ indexing is less
> > > > > than a dozen lines of code to add to your program:
> > > >
> > > > "$.zoo" <- function(object, x) object[, x]
> > > >
> > > > > "$<-.zoo" <- function(object, x, value) {
> > > > >   stopifnot(length(dim(object)) == 2)
> > > > >   if (x %in% colnames(object)) object[, x] <- value
> > > > >   else {
> > > > >     object <- cbind(object, value)
> > > > >     colnames(object)[ncol(object)] <- x
> > > > >   }
> > > > >   object
> > > > > }
> > > >
> > > > # test
> > > > library(zoo)
> > > > z <- zoo(cbind(a = 1:3, b = 4:6))
> > > > z$c <- z$b + 1
> > > > z$a <- z$b - 1
> > > > z
> > > >
> > > > > > dataframes. Is "list" syntax planned for inclusion in xts? At
> > > > > present, column numbering (test.zoo[,1]) seems the best alternative.
> > > > >
> > > > > Since as.data.frame(test.zoo)$a appears to recover the core data, it
> > > > > sounds sensible for "test.zoo$a" to extract an object containing the
> > > > > index and the named column. Does this break anything?
> > > > > e.g. :
> > > > >
> > > > > test.df = data.frame(a=1:5, b=2*(1:5))
> > > > > test.df$a
> > > > > #[1] 1 2 3 4 5
> > > > >
> > > > > index = Sys.time() + 60*1:5
> > > > > test.zoo = zoo(test.df, order.by=index)
> > > > > test.zoo$a
> > > > > #NULL
> > > > >
> > > > > test.zoo[1:2,1]
> > > > > # 2008-02-13 05:54:40 2008-02-13 05:55:40
> > > > > # 1 2
> > > > >
> > > > > coredata(test.zoo)$a
> > > > > #NULL
> > > > >
> > > > > as.data.frame(test.zoo)$a
> > > > > # [1] 1 2 3 4 5
> > > > >
> > > > > thanks and best,
> > > > > christian
> > > > >
> > > > > > At this point 'xts' objects behave much like any standard data.frame, matrix or, most closely, zoo object. They have some unique user 'xts' methods but all standard 'zoo' methods will work (it just extends 'zoo')
> > > > >
> > > >
> > > > _______________________________________________
> > > > R-SIG-Finance at stat.math.ethz.ch mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-finance
> > > > -- Subscriber-posting only.
> > > > -- If you want to post, subscribe first.
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > There's a way to do it better - find it.
> > Thomas A. Edison
> >
> --
> Far better an approximate answer to the right question, which is often
> vague, than the exact answer to the wrong question, which can always
> be made precise -- j.w. tukey
>


