[R-SIG-Finance] how to grow XTS series in R dynamically ? And Quickly!

Daniel Cegiełka daniel.cegielka using gmail.com
Fri Sep 6 21:07:47 CEST 2019



> On 06.09.2019, at 20:04, Vladimir Morozov <vmorozov2006 using gmail.com> wrote:
> 
> Hi Daniel
> Thanks a lot.
> Those are very helpful ideas.
> 
> rbind_append --> it still has to allocate memory for the resulting series, so if memory allocation was the main reason for the slow performance, maybe rbind_append doesn't change much? What do you think?
> 
> Preallocating a regular-interval time series is a good idea.
> However, financial data are irregularly spaced (sometimes there may be no price updates for a few seconds, or even longer).
> Even if we postulate that prices may change no more than once per second, the frequency of price updates has many uses of its own, not only the price values (the simplest assumption is a Poisson arrival process for the updates, but there are many fancier, more powerful models...).
> So preallocating a regularly spaced 1-second xts series dumbs down many things!
> 

Let's start with what exactly you want to do. Do you want to collect market data and save it to disk? Or do you want to run a real-time strategy? These are two different problems and they require distinct solutions.

1) Market data storage - why do you want to use R here? Isn't it better to dump the memory using the mmap syscall and then import it into a database or into R?

2) Real-time market strategy in R - in this case your lookback is limited. So when you add a new data point, you can also drop the oldest one. This way your memory usage stays at the same low level. If this solution suits you, you can write a fast function in C that operates directly on the xts object.

Strictly speaking, R has no separate matrix type - a matrix is just a vector with a dim attribute. Let's say we have classic OHLC data in an xts object:

O H L C
O H L C
O H L C
O H L C
O H L C


In memory, the data is laid out as one long vector, column by column:

x: OOOOOHHHHHLLLLLCCCCC
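
To make the layout concrete, here is a minimal C sketch of how a (row, column) pair maps to an offset in that single vector. The helper name cm_offset is made up for illustration and is not part of R or xts; rows and columns are 0-based, as in C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of element (row, col) in a column-major
 * matrix with nrow rows. Indices are 0-based. */
size_t cm_offset(size_t row, size_t col, size_t nrow)
{
    assert(row < nrow);       /* the row must lie inside the column */
    return col * nrow + row;  /* skip col full columns, then row elements */
}
```

For the 5 x 4 OHLC example, the first High sits at cm_offset(0, 1, 5), i.e. offset 5, and the last Close at cm_offset(4, 3, 5), i.e. offset 19.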

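You can be clever here and shift the data with memmove() (the source and destination overlap, so memcpy() is not safe):

memmove(xp, xp + 1, (nrows(x) * ncols(x) - 1) * sizeof(double));  // xp = REAL(x); or int with INTEGER(x) - use: switch (TYPEOF(x))

memmove(index_p, index_p + 1, (nrows(x) - 1) * sizeof(double));  // index_p points to the index data; or int for Date() type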

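This shifts the whole vector left by one element, so the oldest value of each column is overwritten:

    1 2 3 4 5   1 2 3 4 5   1 2 3 4 5   1 2 3 4 5
x:  O O O O H   H H H H L   L L L L C   C C C C N

The last slot of each column now holds a stale value (H, L, C) or is free (N); these are the slots for the new row.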

Then you can write the new index and the new values into the last row.

You will have preallocated memory at all times and you will copy memory as little as possible. And most importantly: you'll be operating on the xts object all the time, so your R code will be very fast :)
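
The steps above can be sketched in plain C as follows. This is only an illustration under stated assumptions: roll_push is a hypothetical name, the data and the index are both assumed to be doubles, and the buffers are ordinary C arrays. In a real R extension x would come from REAL() on the xts object and index from its index attribute, and modifying an R object in place bypasses R's usual copy-on-modify behaviour, so it needs care.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: drop the oldest row of a column-major
 * nrow x ncol matrix of doubles and append new_row (ncol values)
 * as the newest row. index holds nrow timestamps, oldest first.
 */
void roll_push(double *x, double *index, size_t nrow, size_t ncol,
               double new_time, const double *new_row)
{
    assert(nrow > 0 && ncol > 0);

    /* Shift the whole data vector left by one element. Source and
     * destination overlap, hence memmove() rather than memcpy(). */
    memmove(x, x + 1, (nrow * ncol - 1) * sizeof(double));

    /* The last slot of every column now holds a stale value:
     * overwrite it with the new observation. */
    for (size_t j = 0; j < ncol; j++)
        x[j * nrow + (nrow - 1)] = new_row[j];

    /* Same shift for the time index, then append the new timestamp. */
    memmove(index, index + 1, (nrow - 1) * sizeof(double));
    index[nrow - 1] = new_time;
}
```

One memmove() over the whole matrix is enough because, after the shift, the only misplaced values are those in the last row, and that row is overwritten anyway. The timestamps are stored as given, so irregularly spaced updates need no special handling.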

This is an advanced solution - you need to understand how R's internals work and also have good C skills. If you want to use R for real-time trading, it's worth learning these things.

Daniel





> I wish I could preallocate the vector for the values and maybe the indices, and then do assignments of this sort
> (say, in C++ I would have a method):
>     price.set_next_point(time, value);
> 
> thanks!
> 
> On Sat, Sep 7, 2019 at 12:08 AM Daniel Cegiełka <daniel.cegielka using gmail.com> wrote:
> 
> 
> > On 06.09.2019, at 16:10, Daniel Cegiełka <daniel.cegielka using gmail.com> wrote:
> >
> 
> >
> > 2) preallocation
> >
> > preallocate_matrix <- function(n)
> > {
> >     x <- matrix()
> >     length(x) <- 4 * n      # bid, ask, bid_size, ask_size
> >     dim(x) <- c(n, 4)       # see: ?dim
> >     return(x)
> > }
> >
> > > x <- preallocate_matrix(5)
> > > x
> >      [,1] [,2] [,3] [,4]
> > [1,]   NA   NA   NA   NA
> > [2,]   NA   NA   NA   NA
> > [3,]   NA   NA   NA   NA
> > [4,]   NA   NA   NA   NA
> > [5,]   NA   NA   NA   NA
> 
> ?matrix
> 
> Usage
> matrix(data = NA, nrow = 1, ncol = 1, byrow = FALSE,
>        dimnames = NULL)
> 
> so we don't even need the preallocate_matrix() function:
> 
> > x <- .xts(matrix(nrow = 5, ncol = 4), index = Sys.time() + 1:5)
> > x
>                     [,1] [,2] [,3] [,4]
> 2019-09-06 17:07:27   NA   NA   NA   NA
> 2019-09-06 17:07:28   NA   NA   NA   NA
> 2019-09-06 17:07:29   NA   NA   NA   NA
> 2019-09-06 17:07:30   NA   NA   NA   NA
> 2019-09-06 17:07:31   NA   NA   NA   NA
> 
> 
> 

