[R-SIG-Finance] Discretising intra-day data -- how to get by with less memory?

Gabor Grothendieck ggrothendieck at gmail.com
Fri Nov 27 20:18:51 CET 2009


One other thing to consider is whether you could use zooreg instead of
zoo, in which case you might not need to fill in the gaps:

> z <- zoo(c(11, 13, 14), c(1, 3, 4))
> zz <- as.zooreg(z)
> zz
 1  3  4
11 13 14
> lag(zz)
 0  2  3
11 13 14

Note how time 3 was lagged to become time 2 even though the original
series had no time 2.
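
For contrast, here is a small sketch of how lag behaves on the plain zoo
object versus the zooreg one. It reuses the z from above; the names
lag_irregular and lag_regular are made up for illustration only.

```r
library(zoo)

z  <- zoo(c(11, 13, 14), c(1, 3, 4))   # irregular series: no observation at time 2
zz <- as.zooreg(z)                      # same data, treated as regular with gaps

# Plain zoo: lag() shifts by position, so each value moves to the
# previous *observed* time point and the series gets one element shorter.
lag_irregular <- lag(z)

# zooreg: lag() shifts the time index itself by one period, so times
# 1, 3, 4 become 0, 2, 3 and no gap filling is needed beforehand.
lag_regular <- lag(zz)

print(lag_irregular)
print(lag_regular)
```

If the timestamps really do sit on a regular grid with holes, this should
let you keep only the observed rows rather than padding every missing
time point with NA.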

On Fri, Nov 27, 2009 at 12:27 PM, Ajay Shah <ajayshah at mayin.org> wrote:
> On Fri, Nov 27, 2009 at 07:37:03AM -0500, Gabor Grothendieck wrote:
>> What you are asking for has the potential to create huge data sets
>> depending on the time range of the data and N. What are they?   If
>> that is the problem then it's not just a matter of how memory intensive
>> the code is but just about any manipulation will fail.  Do the
>> problems arise on the aggregate or moving back and forth between zoo
>> and ts?
>
> The object I'm dealing with has 13,667,891 rows and a lot of columns.
>
> I thought it might make sense to:
>
>  thinz <- z[,1]
>  figure out the row numbers for the aggregate(blah, tail, 1) operation in thinz
>  discretised <- z[therownums,]
>
> So instead of doing an aggregate(blah,tail,1), we'd analyse thinz and
> come up with an integer vector therownums, and use that to make the
> discretised object.
>
> This would be memory efficient since thinz has only one column.
>
> --
> Ajay Shah                                      http://www.mayin.org/ajayshah
> ajayshah at mayin.org                             http://ajayshahblog.blogspot.com
> <*(:-? - wizard who doesn't know the answer.
>
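
A minimal sketch of the row-number idea quoted above, on toy data. The
timestamps, the bucket width N and the use of floor(time / N) as the
bucket id are all assumptions made for illustration; the real data and
discretisation rule may differ.

```r
library(zoo)

# Hypothetical toy data: irregular numeric timestamps, three columns.
set.seed(1)
tt <- c(1, 4, 9, 12, 18, 21, 29, 33)
z  <- zoo(matrix(rnorm(length(tt) * 3), ncol = 3), tt)

N <- 10                                  # assumed bucket width, in time units

# Work on a single column so that only a thin object is touched.
thinz  <- z[, 1]
bucket <- floor(as.numeric(time(thinz)) / N)   # bucket id for each row

# Keep the last row of each bucket: a row is kept when no later row
# carries the same bucket id.
therownums <- which(!duplicated(bucket, fromLast = TRUE))

# Subset the wide object once, by integer row position.
discretised <- z[therownums, ]
print(therownums)
print(time(discretised))
```

Note that discretised keeps the original timestamp of the last
observation in each bucket rather than the bucket label; if the bucket
label is wanted as the index, it would have to be assigned separately.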


