[R-SIG-Finance] Discretising intra-day data -- how to get by with less memory?

Gabor Grothendieck ggrothendieck at gmail.com
Fri Nov 27 13:37:03 CET 2009


What you are asking for has the potential to create huge data sets,
depending on the time range of the data and on N. What are they? If
that is the problem then it's not just a matter of how memory-intensive
the code is: just about any manipulation will fail. Do the problems
arise in the aggregate step, or in moving back and forth between zoo
and ts?
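
If the blow-up happens in the ts round trip, one possibility is to
build the N-second grid directly and merge against an empty zoo, which
skips the intermediate ts copies. A rough sketch (untested, assuming z
and Nseconds as in your function):

library(zoo)

intraday.discretise2 <- function(z, Nseconds) {
  toNsec <- function(x) as.POSIXct(Nseconds * ceiling(as.numeric(x) / Nseconds),
                                   origin = "1970-01-01")
  # last observation in each N-second block, as before
  d <- aggregate(z, toNsec, tail, 1)
  # regular N-second grid spanning the aggregated data
  grid <- seq(start(d), end(d), by = Nseconds)
  # merging with a zero-width zoo adds NA rows for empty blocks
  # without materialising a ts object
  merge(d, zoo(, grid))
}

Note that the result still has one row per N-second slot, so if the
time span divided by N is huge, the output will be large no matter how
it is computed.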

On Fri, Nov 27, 2009 at 6:56 AM, Ajay Shah <ajayshah at mayin.org> wrote:
> I'm using this function to convert intra-day data into a grid with one
> observation every N seconds:
>
>  # This function consumes "z", a zoo object whose timestamps are intraday,
>  # and "Nseconds", the discretisation period in seconds.
>  # The key ideas are from this thread:
>  #    https://stat.ethz.ch/pipermail/r-sig-finance/2009q4/005144.html
>  intraday.discretise <- function(z, Nseconds) {
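>    # round each timestamp up to the end of its N-second block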
>    toNsec <- function(x) as.POSIXct(Nseconds*ceiling(as.numeric(x)/Nseconds),
>                                     origin = "1970-01-01")
>    d <- aggregate(z, toNsec, tail, 1)
>    # At this point there is one problem: NA records are not created
>    # for blocks of time in which there were no records.
>
>    # To solve this:
>    dreg <- as.zoo(as.ts(d))
>    class(time(dreg)) <- class(time(d))
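>    # (note: as.ts() creates a slot for every N-second interval between
>    # the first and last timestamp, so dreg can be far larger than d
>    # when the data are sparse over a long span)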
>
>    dreg
>  }
>
> This works correctly, but it's incredibly memory-intensive; I'm
> running out of core on some problems.
>
> Is there a way to write this which would use less RAM?
>
> --
> Ajay Shah                                      http://www.mayin.org/ajayshah
> ajayshah at mayin.org                             http://ajayshahblog.blogspot.com
> <*(:-? - wizard who doesn't know the answer.


