[R] RODBC sqlQuery insert slow
Michel Lang
michellang at gmail.com
Fri Oct 13 17:19:09 CEST 2006
> I am trying to insert a lot of data into a table using windows R (2.3.1)
> and a mysql database via RODBC.
> First I read a file with read.csv and then form sql insert statements
> for each row and execute the insert query one row at a time. See the
> loop below.
> This turns out to be very slow.
> Can anyone please suggest a way to speed it up?
A few weeks ago I had to solve a similar task. Inserting each row turned out
to be horrible slow due to paste() and the data.frame-indexing. The estimated
runtime would have been over 3 weeks, so I used MySQLs LOAD DATE INFILE
syntax to speed things up. You must have FILE_PRIV = 'Y' set in the
mysql.user-table to use this small hack, and I'm not that sure that this runs
remotely. It is also assumed that your df has valid column-names.
tmp_filename <- tempfile()
write.table(df, tmp_filename, na = "\\N", row.names = FALSE, col.names =
FALSE, quote = FALSE, sep = "\t")
query <- paste(
"LOAD DATA LOCAL INFILE '", tmp_filename, "'",
" INTO TABLE ", your_table, " (", toString(names(df)), ");", sep = "")
sqlQuery(channel, query)
unlink(tmp_filename)
The total runtime for the LOAD DATA INFILE querys was something below 5
minutes, inserting 3e+6 rows with > 200 columns.
Michel Lang
More information about the R-help
mailing list