[R-sig-DB] [R] SQLite: When reading a table, a "\r" is padded onto the last column. Why?
David James
d@j025 @end|ng |rom gm@||@com
Fri Jan 5 23:19:21 CET 2007
On 1/5/07, Seth Falcon <sfalcon using fhcrc.org> wrote:
> "David James" <daj025 using gmail.com> writes:
>
> > Hi,
> >
> > A word of caution: relying on read.table() to read very large
> > files (even piecemeal) can be extremely inefficient. The reason
> > I implemented the dbWriteTable from a file is exactly to circumvent
> > this problem. You can upload a huge data file into the db
> > extremely fast, but going through read.table() could be considerably
> > slower.
>
> I wonder if things have changed at all in that regard. I would expect
> read.table to be slower, but, if called carefully, not _extremely_
> slower. We should do some timings...
>
> My concern with the .import-like code is on the maintenance
> side and in terms of providing all of the same features as read.table
> (at which point, perhaps the performance gain will diminish).
Yes, doing a full-fledge csv parser would be painful, and the current
code is fairly inflexible (quotes, locale-specific numbers, etc., etc.)
>
> > Perhaps we're considering two slightly different situations:
> > (1) export a data.frame to sqlite (here, the prepared statement/binding
> > approach is optimal).
>
> The prepared statement/binding approach is also the thing to do when
> you have a large text file that you want to import, but you need to do
> some massaging along the way (perhaps you only want parts of it,
> perhaps you want to load into two or more tables, etc).
>
> > (2) load a csv file into R or SQLite (here using the C code from .import
> > is optimal).
>
> I can see the use-case, but have some reservations. Without the
> functionality we have now you could:
>
> a. use SQLite directly and skip R.
> b. use R and make a system call to the sqlite command line.
Hmm, we could, but we would have to make sure that the Windows
package includes the sqlite command-line interpreter (I seem to recall
that it only include the sqlite library). In general, RSQLite does not use
anything but the library to ensure it's self-contained.
>
> The dbWriteTable functionality is more elegant than (a) or (b), and
> perhaps I would have fewer reservations if it was called
> sqliteImportFile() -- do the other DBI packages support file import in
> this way via dbWriteTable?
No, they don't. I never thought about providing an import facility on top
of those already provided natively by Oracle and MySQL; the case with
SQLite is slightly different if we only consider it as an embedded engine in R.
I guess you're suggestion of moving it to sqliteImportFile() is cleaner.
--
David
>
> + seth
>
More information about the R-sig-DB
mailing list