[R-sig-DB] [R] SQLite: When reading a table, a "\r" is padded onto the last column. Why?

Seth Falcon
Sat Jan 6 00:42:58 CET 2007


Prof Brian Ripley <ripley using stats.ox.ac.uk> writes:
> I would be surprised if read.table used carefully took a significant
> part of the time of a total analysis.  I hesitate to do timings
> without knowing the sort of table you are discussing: does it have
> many columns or many rows or both, and what variable types?
>
> Having some real-life examples to think about would be very helpful
> (as it would be for some of the efficiency issues we have been working
> on with data frames).

We've been working with annotation data for Affymetrix Mapping arrays
(SNP chips).  This translates to many rows (6M) and a handful of
columns (6-10) with a mix of integer, double, and character columns.

As you wrote, if one needs to do any manipulation of the data before
loading into the DB, then read.table will most likely not be the
bottleneck.

>>  a. use SQLite directly and skip R.
>>  b. use R and make a system call to the sqlite command line.
>
> Or
>    c. Send a suitable SQL query from the R package. I've done that in
>       RMySQL and RODBC in the past (e.g. using LOAD DATA INFILE in
>       MySQL).

Unfortunately, AFAIK, SQLite does not provide SQL syntax for loading a
file in bulk like that.  I think it did at one time, but the feature
went away with one of the newer versions.
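
Without a bulk-load statement, a common fallback is to send ordinary
INSERTs from the client and wrap the whole load in one transaction.
Below is a minimal sketch of that idea, written in Python with the
standard-library sqlite3 and csv modules rather than R.  The table name
"snp", the column names, the sample data and the load_delimited helper
are all hypothetical and only stand in for a real annotation file.  The
sample uses CRLF line endings to show one way of keeping a trailing "\r"
out of the last column.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a tab-delimited annotation file with
# Windows (CRLF) line endings; the column names are illustrative only.
SAMPLE = "probe_id\tposition\tscore\r\nSNP_A-1\t101\t0.5\r\nSNP_A-2\t202\t1.25\r\n"


def load_delimited(conn, table, handle, delimiter="\t"):
    """Insert every data row from `handle` into `table`; return the row count."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)  # first line holds the column names
    placeholders = ", ".join("?" for _ in header)
    columns = ", ".join('"{}"'.format(name) for name in header)
    sql = 'INSERT INTO "{}" ({}) VALUES ({})'.format(table, columns, placeholders)
    # One transaction for the whole load: committing after every row is
    # usually what makes naive row-by-row inserts slow.
    with conn:
        cursor = conn.executemany(sql, reader)
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snp (probe_id TEXT, position INTEGER, score REAL)")
# newline="" leaves the "\r\n" pairs for the csv module to consume, so no
# stray "\r" ends up attached to the last column.
n_rows = load_delimited(conn, "snp", io.StringIO(SAMPLE, newline=""))
rows = conn.execute(
    "SELECT probe_id, position, score FROM snp ORDER BY position"
).fetchall()
```

For a file on disk, the same helper could be given
open(path, newline="") in place of the in-memory sample.  Whether this is
fast enough for 6M rows would need timing on real data.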

+ seth



