[R-SIG-Finance] R-SIG-Finance Digest, Vol 40, Issue 16
icosa atropa
icos.atropa at gmail.com
Sun Sep 23 02:45:23 CEST 2007
I settled on Postgresql to store my timeseries prior to R processing,
and I've been very pleased with my choice. It has extensive datatype
support - In particular, I've used POSIX , float, text, and even GIS
extensions with great success. Postgresl documentation is __very__
good. Admin requires some time and resources (not VPS friendly by
default), but it's been a very satisfying learning experience for me.
I consider postgresql the "designers" choice of database to interface
with R. It supports cool features like foreign keys and views that I
now take for granted. As previously mentioned, server-side use of
perl and/or python functions defined and triggered within the database
can be very useful. Overall, if the product is to be used long-term,
by more than one person, or is developed by someone who enjoys
programming, then I consider Postgreql a very good choice.
I use RODBC to connect R to postgresql (extra datatype coersion can
occur here). The unixODBC docs (dongle between RODBC and postgresql)
are decent and generally complete. I preprocess my data with Perl -
the perl database interface docs are high-quality; i imagine python is
similar.
Some interesting advantages of postgresql (also apply to mysql?) that
I've found:
I can easily interface with different front ends
- php and html
- Openoffice base
- psql, etc
A final note - Relational databases don't "understand" timeseries per
se, and are not guaranteed to return rows in a given order unless
specifically requested. Adding a "unique" constraint on the time
index column and sorting on the time index gives the table the
appearance of being a timeseries. As far as the database is
concerned, however, its just a heap of tuples. Hence the importance
of pre-processing text data with a text-friendly language like python
or perl before db insertion.
hope this helps,
christian
> paul sorenson wrote:
> > I am collecting hundreds of time series and storing them in sqlite. I
> > am downloading and storing with python then reading them out with R.
>
> Just for the archive: yes MySQL, PostgreSQL, Oracle, DB2, etc can store
> data types better than sqlite. As Paul notes, the main reason to use
> sqlite is that you don't need to actually administer a real database
> environment.
>
> - Brian
More information about the R-SIG-Finance
mailing list