[R-SIG-Finance] time series database packages

Paul Gilbert pgilbert at bank-banque-canada.ca
Tue Nov 6 18:23:41 CET 2007


I have put a group of packages on CRAN for time series databases. The 
current versions should be considered beta, and I would appreciate 
feedback from this SIG before announcing them more broadly. (Thanks to 
Gabor Grothendieck for comments on an alpha version.)

TSdbi defines a common API which the other packages use. TSMySQL and 
TSSQLite provide methods for MySQL and SQLite, and require RMySQL and 
RSQLite respectively. TSpadi uses an RPC-based protocol for a 
client/server connection where the server could use any database, but 
the working implementation is with Fame. (This last package is mainly 
for me to support legacy applications, but it also helps test the 
generality of the interface.)

I believe it should be straightforward to implement the interface for 
any SQL database that has a DBI-based package, and also not difficult to 
implement it on top of RODBC, though I have not tried that yet. It 
should also be possible to interface to the R fame package directly, 
which could provide writing to the database and some other features not 
supported by TSpadi. (If anyone is interested in working on any of 
these, please contact me for additional hints.)

The SQL implementations define the tables needed to set up the back-end 
database, but this might benefit from examination by someone who 
understands SQL table optimization better than I do. The current 
implementation supports annual, quarterly, monthly, semiannual, weekly, 
daily, business day, minutely, irregular data with a date, and irregular 
data with a date and time. This may be constrained by the back end 
(e.g. Fame does not support all these types). My own work tends to be 
with the first three, so the others have not been tested as extensively. 
It should be relatively easy to implement other types of time series 
data in the SQL back ends (suggestions and examples?).

Series documentation is supported in a meta table, which also contains a 
lookup mechanism to determine which table has the data for a given 
series identifier. (Multilingual documentation support is not 
implemented, but should not be too difficult.)
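
As a rough illustration of the meta table lookup described above, here 
is a minimal sketch using Python's standard sqlite3 module (chosen only 
so the example is self-contained; it is not the R code in the packages). 
The table names (Meta, A, Q), the column names, and the sample values 
are all assumptions made for this example and are not the schema that 
TSMySQL or TSSQLite actually define.

```python
import sqlite3

# Hypothetical schema: one meta table plus one data table per frequency.
# These names are illustrative assumptions, not the packages' real tables.
SCHEMA = """
CREATE TABLE Meta (
    id          TEXT PRIMARY KEY,  -- series identifier
    tbl         TEXT NOT NULL,     -- name of the table holding the data
    description TEXT               -- series documentation
);
CREATE TABLE A (id TEXT, year INTEGER, v REAL);                  -- annual
CREATE TABLE Q (id TEXT, year INTEGER, period INTEGER, v REAL);  -- quarterly
"""

# Date columns for each known data table. Table names cannot be bound as
# SQL parameters, so the name read from Meta is checked against this map.
DATE_COLUMNS = {"A": ("year",), "Q": ("year", "period")}


def get_series(conn, series_id):
    """Return (description, rows) for a series, using Meta to find its table."""
    row = conn.execute(
        "SELECT tbl, description FROM Meta WHERE id = ?", (series_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no such series: {series_id}")
    tbl, description = row
    if tbl not in DATE_COLUMNS:
        raise ValueError(f"unknown data table: {tbl}")
    cols = ", ".join(DATE_COLUMNS[tbl])
    rows = conn.execute(
        f"SELECT {cols}, v FROM {tbl} WHERE id = ? ORDER BY {cols}",
        (series_id,),
    ).fetchall()
    return description, rows


# Small in-memory database with made-up values.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO Meta VALUES ('pop.a', 'A', 'Population, annual')")
conn.execute("INSERT INTO Meta VALUES ('gdp.q', 'Q', 'GDP, quarterly')")
conn.executemany("INSERT INTO A VALUES (?, ?, ?)",
                 [("pop.a", 2006, 32.5), ("pop.a", 2005, 32.25)])
conn.executemany("INSERT INTO Q VALUES (?, ?, ?, ?)",
                 [("gdp.q", 2007, 2, 101.0), ("gdp.q", 2007, 1, 100.0)])

description, rows = get_series(conn, "gdp.q")
print(description, rows)
```

The caller only supplies the series identifier; the frequency-specific 
table is found through Meta, which is the point of the lookup.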

The design also (optionally) supports vintages and panels of data (e.g.
series with the same identifier but a different release date or 
country). This feature is actively under development.
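
To make the idea of vintages concrete, the following is a minimal 
sketch, again in Python with the standard sqlite3 module, of storing 
several releases of one series and retrieving either a specific release 
or the most recent one. The table name vintageQ, its columns, the 
function name, and all values are assumptions for illustration only; the 
packages' own design for this feature may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical quarterly table with an extra vintage (release date) column.
# ISO 8601 date strings are used so that text ordering matches date ordering.
conn.execute(
    "CREATE TABLE vintageQ ("
    "id TEXT, vintage TEXT, year INTEGER, period INTEGER, v REAL)"
)
conn.executemany(
    "INSERT INTO vintageQ VALUES (?, ?, ?, ?, ?)",
    [
        # First release of the (made-up) series.
        ("gdp", "2007-08-31", 2007, 1, 100.0),
        ("gdp", "2007-08-31", 2007, 2, 101.0),
        # Later release: earlier quarters revised, one new quarter added.
        ("gdp", "2007-11-30", 2007, 1, 100.2),
        ("gdp", "2007-11-30", 2007, 2, 101.1),
        ("gdp", "2007-11-30", 2007, 3, 102.0),
    ],
)


def get_vintage(conn, series_id, vintage=None):
    """Return (vintage, rows); use the latest vintage when none is given."""
    if vintage is None:
        vintage = conn.execute(
            "SELECT MAX(vintage) FROM vintageQ WHERE id = ?", (series_id,)
        ).fetchone()[0]
        if vintage is None:
            raise KeyError(f"no such series: {series_id}")
    rows = conn.execute(
        "SELECT year, period, v FROM vintageQ "
        "WHERE id = ? AND vintage = ? ORDER BY year, period",
        (series_id, vintage),
    ).fetchall()
    return vintage, rows


print(get_vintage(conn, "gdp"))                # most recent release
print(get_vintage(conn, "gdp", "2007-08-31"))  # data as first released
```

A panel keyed by country rather than release date could be sketched the 
same way, with the extra column holding a country code.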

The intention is that the R time series representation can optionally be 
specified, but currently only the default is working (ts where possible 
and zoo elsewhere).

Vignette examples are provided in each of the packages. (The vignettes 
are similar, but the most complete at the moment is the TSMySQL one.)

Some possible extensions include:

- a mechanism for handling aliases for series names.

- an RODBC database plug in

- an R PostgreSQL database plug in

- a direct Fame database plug in (Fame through TSpadi is read only)

- optionally different time series representations.

- multilingual documentation

- mechanism for signaling series updates to users

It is unlikely that I will do many of these things myself, but if anyone 
is interested in working on them I would be happy to provide some guidance.

Paul Gilbert
