[R] Advantages of using SQLite for data import in comparison to csv files
Magnus Torfason
zulutime.net at gmail.com
Thu Jan 14 23:36:32 CET 2010
RSQLite is actually much more than a "driver for data import". It
basically allows any manipulation of SQLite databases, including
reading, writing, or updating and munging data within the database.
Regarding the original question of data import:
I use csv and sqlite files interchangeably to store my data, and
converting between one and the other is generally trivial (read one
format into a data.frame and then write it into the other).
For me, the key determinant is whether a given data set is so big that
reading it into a data.frame with read.csv() every time I need it is a
pain. I usually keep all my original data in csv files and write
routines to write the contents of huge files into sqlite databases.
sqldf is a great package as well, but it definitely does not obviate the
need to actually be able to manipulate data base files. For one thing,
you cannot index a csv file or a data.frame. If you have to repeatedly
select subsets of your large data set, creating an index on the relevant
column in the sqlite table is an absolute life saver.
(However, with a 1000x20 data set, you will probably not have to worry
about the speed of selecting from the table. Unless you need to use
bigger data sets, the simplest way is probably to just use csv files,
read the contents into a data.frame with read.csv, and then use sqldf on
the data.frame if you need to do complicated subsetting)
Best,
Magnus
On 1/14/2010 2:12 AM, Juliet Jacobson wrote:
> But then why has the R driver for data import from a SQLite database
> been written?
>
> Gabor Grothendieck wrote:
>> You could look at read.csv.sql in sqldf (http://sqldf.googlecode.com) as well.
More information about the R-help
mailing list