[R-SIG-Finance] R + HDF5 + Pytables

Daniel Cegiełka daniel.cegielka at gmail.com
Mon May 17 16:51:42 CEST 2010


Hi Manoj,
I tested hdf5 with R, and in my opinion it makes no sense to use it
with xts/zoo for tick data.
If you are going to work in R, it is much better to store xts objects (or
other R objects) directly on disk: it's simpler and faster.
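
A minimal sketch of what storing an xts object directly on disk can look like, using base R's saveRDS()/readRDS(). It assumes the xts package is installed; the file name and the sample prices are made up for illustration:

```r
library(xts)

# Hypothetical tick data: five prices at one-second intervals
times <- as.POSIXct("2010-02-17 09:30:00", tz = "UTC") + 0:4
ticks <- xts(c(100.0, 100.5, 100.25, 101.0, 100.75), order.by = times)
colnames(ticks) <- "price"

# Write the object to disk as-is and read it back
path <- file.path(tempdir(), "ticks.rds")
saveRDS(ticks, path)
restored <- readRDS(path)

# The restored object is still an xts, so time-based subsetting works
window <- restored["2010-02-17 09:30:01/2010-02-17 09:30:03"]
print(window)
```

The object comes back with its index and column names intact, so no conversion step is needed between the file and the analysis.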

Check out Jeff Ryan's packages:
RBerkeley: https://r-forge.r-project.org/projects/rberkeley/
indexing: http://r-forge.r-project.org/projects/indexing/

Example for RBerkeley:

dbh <- db_create()
db_open(dbh, file='blotter.db')   # open the database file on disk

# and a sample query: column 4 of GOOG for the given date range
unserialize(db_get(dbh, key='GOOG'))['2010-02-17::2010-02-25', 4]


If you need an ultra-fast solution, try Jeff's indexing package ;)

regards,
daniel




2010/5/17 Manoj <manojsw at gmail.com>
>
> Dear All,
>       I have created an HDF5 file using Python + Pytables. The HDF5
> file stores tick data and as such is quite large. I am planning
> to use the R/zoo/xts combination for analytics. The tricky bit is that I
> am unable to find a good wrapper to access/query the HDF5 file created by
> Pytables (keeping intact all the nice features of the HDF5 file, such as
> indices). The hdf5 library in R wouldn't help given the size of
> the file.
>
>      One (crude) option is to query data using Python/Pytables, write
> to an output file and invoke R for analytics. The question is - could
> this task be done in a more efficient fashion? Is there a good
> HDF5/Pytables wrapper that could help me do the task completely within
> R?
>
>     Any tips/suggestions would be greatly appreciated.
>
> Thanks.
>
> Manoj