[R-sig-DB] [R] SQLite: When reading a table, a "\r" is padded onto the last column. Why?

David James d@j025 @end|ng |rom gm@||@com
Fri Jan 5 23:19:21 CET 2007


On 1/5/07, Seth Falcon <sfalcon using fhcrc.org> wrote:
> "David James" <daj025 using gmail.com> writes:
>
> > Hi,
> >
> > A word of caution:  relying on read.table() to read very large
> > files (even piecemeal) can be extremely inefficient.  The reason
> > I implemented the dbWriteTable from a file is exactly to circumvent
> > this problem.  You can upload a huge data file into the db
> > extremely fast, but going through read.table() could be considerably
> > slower.
>
> I wonder if things have changed at all in that regard.  I would expect
> read.table to be slower, but, if called carefully, not _extremely_
> slower.  We should do some timings...
>
> My concern with the .import-like code is on the maintenance
> side and in terms of providing all of the same features as read.table
> (at which point, perhaps the performance gain will diminish).

Yes, doing a full-fledge csv parser would be painful, and the current
code is fairly inflexible (quotes, locale-specific numbers, etc., etc.)

>
> > Perhaps we're considering two slightly different situations:
> > (1) export a data.frame to sqlite (here, the prepared statement/binding
> > approach is optimal).
>
> The prepared statement/binding approach is also the thing to do when
> you have a large text file that you want to import, but you need to do
> some massaging along the way (perhaps you only want parts of it,
> perhaps you want to load into two or more tables, etc).
>
> > (2) load a csv file into R or SQLite (here using the C code from .import
> > is optimal).
>
> I can see the use-case, but have some reservations.  Without the
> functionality we have now you could:
>
>  a. use SQLite directly and skip R.
>  b. use R and make a system call to the sqlite command line.

Hmm, we could, but we would have to make sure that the Windows
package includes the sqlite command-line interpreter (I seem to recall
that it only include the sqlite library).  In general, RSQLite does not use
anything but the library to ensure it's self-contained.

>
> The dbWriteTable functionality is more elegant than (a) or (b), and
> perhaps I would have fewer reservations if it was called
> sqliteImportFile() -- do the other DBI packages support file import in
> this way via dbWriteTable?

No, they don't.  I never thought about providing an import facility on top
of those already provided natively by Oracle and MySQL; the case with
SQLite is slightly different if we only consider it as an embedded engine in R.

I guess you're suggestion of moving it to sqliteImportFile() is cleaner.

--
David

>
> + seth
>




More information about the R-sig-DB mailing list