[R] R and databases - a comment

Tony Plate tplate at acm.org
Fri Dec 9 17:05:30 CET 2005


This is very useful, thanks for posting!

I created a page for this at the R Wiki: 
http://fawn.unibw-hamburg.de/cgi-bin/Rwiki.pl?DataBases

If any one has any info to add, go at it!

-- Tony Plate

charles loboz wrote:
> 1. That was a part of a private email exchange. It has
> been suggested that more people may be interested. 
> 
> 2. I did use various databases (significant part of my
> job) for the last 15 years. Some with R for the last 3
> years as a hobby. Some comments on the ones used
> below. Sorry, no links - I am time-constrained at the
> moment - please google if interested in details. The
> remarks are from the point of view of R user, not that
> of 'general database user'.
>  
> 3. SQLITE. www.sqlite.org - probably the best datase
> to use with R. No setup, no administration, embedded -
> so less connection overhead. All data in one file - so
> easy to transfer. Solid. Very functional SQL, fast if
> you play it right (almost as fast as SQLServer on
> Windows...) . Some limitations - no stored procedures.
> Some preprocessing/parsing can be done using TCL -
> well integrated with sqlite if you need that. Due to
> the implementation quirk you can even compute
> recursive functions (like exponential moving average
> or Fibonacci numbers) with SQL :-). Easy import/export
> of data to text files. After trying few other dbs I
> settled down on this one. Even considered writing a
> tutorial on SQLite use with R (like how to process
> gigabytes of data on a 128mb computer :-) ) - but time
> constraints stopped me. [Personally I think that
> SQLite should come bundled with the standard R
> installation. Could even be used to keep a lot of R's
> internal stuff, would probably simplify overall
> coding. But that is for others to decide]
>  
> All other databases (including mysql) require typical
> setup - installation, administration, user rights,
> keeping track of ports, services/daemons, directories,
> backups etc - so some db administrative skills are
> required.I am not sure how many R users are willing to
> go through that. The ones who may be interested in the
> stuff below
>  
> 4. www.postgres.org Postgres. Free. As complete as one
> can wish, small download, great functionality.
> Interfaces well to other languages, so you can do
> numerics in C++ and store that in the database (though
> why not do numerics in R?). Current version 8.1, much
> improved. 
>  
> 5. Firebird. open source verion of Interbase. Easy
> setup and can have all data in one file. But... slow
> development - not many developers there. SQL full but
> somewhat quirky (when porting from other dialects). 
>  
> 6. Mysql. the inheritance from the original ISAM
> system still shows. Nice user interface, but... if you
> need real db why not use postgres? if you need
> something simpler, without administration, why not use
> SQLITE? No doubt mysql is fine for many simple
> websites etc - this is mysql's niche.
>  
> 7. derby and hsqldb. both are written in Java, open
> source. HSQLDB (used now by OpenOffice) allows
> creation of in-memory tables and it's fast there - but
> it's usage from inside R is tricky - there is no
> easily available, installable and current ODBC driver.
> Similar for derby - the ODBC driver is there, but
> installation can be tricky to non-professionals. May
> be in the future...
>  
> There are three 'express' versions of commercial
> databases. They all share some restrictions, like max
> disc data size 2-4gb, max mem size 1-2gb and usage of
> single processor only. Plus various licensing
> restrictions, so be careful how you use them. 
>  
>  - Microsoft - in beta now, over 100mb download
> (windows only) (the old version, MSDE, is also
> available)
>  - Oracle - 150mb download, if i remember correctly
> even free to distribute, but check the license
>  - DB2 - 500mb download, currently 90 day version, IBM
> strong rumour is that early next year the new version
> will be free. 
>  
> Each commercial DB has some OLAP capability, but I am
> not sure how much of it is/will be available in the
> Express version.
> 
> 
> 		
> __________________________________________ 
> 
> Just $16.99/mo. or less. 
> dsl.yahoo.com
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>




More information about the R-help mailing list