[R] SQL INSERT using RMySQL
Gregory Warnes
gregory.warnes at mac.com
Sat Apr 12 20:40:20 CEST 2008
Hi All,
I figured out my problem. It was a combination of a lack of
understanding on my part and a bit of missing functionality. I made
a small patch to the mysqlWriteTable() function so that it passes
MySQL the field names corresponding to the data columns passed in:
diff -ru RMySQL.orig/R/MySQLSupport.R RMySQL/R/MySQLSupport.R
--- RMySQL.orig/R/MySQLSupport.R 2007-05-31 22:36:02.000000000 -0400
+++ RMySQL/R/MySQLSupport.R 2008-04-11 17:50:29.000000000 -0400
@@ -616,7 +616,9 @@
on.exit(unlink(fn), add = TRUE)
sql4 <- paste("LOAD DATA LOCAL INFILE '", fn, "'",
" INTO TABLE ", name,
- " LINES TERMINATED BY '\n' ", sep="")
+ " LINES TERMINATED BY '\n' ",
+ " ( ", paste(names(field.types), collapse=", "), ");",
+ sep="")
rs <- try(dbSendQuery(new.con, sql4))
if(inherits(rs, ErrorClass)){
warning("could not load data into table")
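To make the effect of the patch concrete, here is a minimal sketch of how the patched statement is assembled. The helper name build_load_sql, the file path, and the two example columns are hypothetical and used only for illustration; in the package the paste() call sits inside the write-table function itself.

```r
# Hypothetical helper mirroring the patched paste() call; not part of RMySQL.
build_load_sql <- function(fn, name, field.types) {
  paste("LOAD DATA LOCAL INFILE '", fn, "'",
        " INTO TABLE ", name,
        " LINES TERMINATED BY '\n' ",
        " ( ", paste(names(field.types), collapse = ", "), ");",
        sep = "")
}

# Example column types for two data columns (note: no 'id' column).
# The path "/tmp/data.txt" is a placeholder.
ft <- c(customer_id = "int(11)", item_upc = "bigint(12)")
sql <- build_load_sql("/tmp/data.txt", "past_purchases", ft)
cat(sql, "\n")
```

Because the statement names only the columns present in the data, MySQL is left to fill in any column that is not listed, such as an auto_increment key.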
I also defined a useful function for describing the structure of an
existing table:
## Generic: return a named character vector of column types for a table
setGeneric(
  "dbDescribeTable",
  function(conn, name, ...)
    standardGeneric("dbDescribeTable"),
  valueClass = "character"
)

setMethod(
  "dbDescribeTable",
  signature(conn="MySQLConnection", name="character"),
  def = function(conn, name, ...){
    ## DESCRIBE returns one row per column, with Field and Type columns
    rs <- dbGetQuery(conn, paste("describe", name))
    fields <- rs$Type
    names(fields) <- rs$Field
    ## return an empty character vector if nothing came back
    if(length(fields)==0)
      fields <- character()
    fields
  },
  valueClass = "character"
)
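One possible use of such a description is to check, before writing, which table columns a data frame does not supply. The sketch below works on a plain named character vector of the kind dbDescribeTable() returns, so it runs without a database connection; the helper name missing_columns and the sample values are assumptions for illustration only.

```r
# Hypothetical helper: table columns that the data frame does not supply.
missing_columns <- function(table_fields, data) {
  setdiff(names(table_fields), names(data))
}

# Stand-in for the result of dbDescribeTable(con, "some_table")
table_fields <- c(id = "int(10) unsigned", a = "char(1)", b = "char(1)")
x <- data.frame(a = letters[1:3], b = letters[4:6])

# Columns the database would have to fill in itself
missing_columns(table_fields, x)
```

Here the only column missing from the data is the primary key 'id'.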
And I now have working code:
> ## Columns in the table
> dbDescribeTable(con, "past_purchases")
                id        customer_id           item_upc
"int(10) unsigned"          "int(11)"       "bigint(12)"
         suggested           quantity              total
      "tinyint(1)"          "int(11)"          "int(11)"
           on_sale       actual_price           featured
      "tinyint(1)"           "double"       "tinyint(1)"
              date
            "date"
>
> ## columns in my data (note the absence of the primary key 'id')
> head(fulldata)
  customer_id    item_upc suggested quantity total on_sale
1           3 11111111632     FALSE        1     1   FALSE
2           3 11111111733     FALSE        1     1   FALSE
3           3 11111116095     FALSE        1     1   FALSE
4           3 11111117164     FALSE        1     1   FALSE
5           3 11111117210     FALSE        1     1   FALSE
6           3 11111119092     FALSE        1     1   FALSE
  actual_price featured       date
1        10.49    FALSE 2008-03-22
2         4.99    FALSE 2008-03-22
3         5.49    FALSE 2008-03-22
4         9.99    FALSE 2008-03-22
5         4.19    FALSE 2008-03-22
6         3.99    FALSE 2008-03-22
>
> dim(fulldata)
[1] 75 9
>
>
> ## Size of the table before adding my data
> dbGetQuery(con, "SELECT COUNT(ID) FROM past_purchases")[1,1]
[1] 675
>
> ## Insert the data
> dbWriteTable(
+ con,
+ "past_purchases",
+ value=fulldata,
+ overwrite=FALSE,
+ append=TRUE,
+ row.names=FALSE #,
+ #field.types=field.types
+ )
[1] TRUE
>
> ## Size of the table after adding my data
> dbGetQuery(con, "SELECT COUNT(ID) FROM past_purchases")[1,1]
[1] 750
-Greg
On Apr 11, 2008, at 10:57PM , Chris Stubben wrote:
>
> Greg,
>
> If you have a MySQL table with an auto_increment field, you could just
> insert a NULL value into that column and the database will increment
> the key (it may not work in SQL STRICT mode, I'm not sure). I don't
> think there's any way to specify which columns you want to load data
> into using dbWriteTable yet, but that would be a nice feature since
> LOAD DATA now allows that (and SET syntax and other options).
>
> Try the code below - I used cbind(NA, x) to insert a null into the
> first column.
>
> Chris
>
>> dbSendQuery(con, "create table tmp (id int not null auto_increment primary key, a char(1), b char(1))")
> <MySQLResult:(369,1,67)>
>> x<-data.frame( a=letters[1:3], b=letters[4:6])
>> x
> a b
> 1 a d
> 2 b e
> 3 c f
>> dbWriteTable(con, "tmp", cbind(NA, x), row.name=FALSE, append=TRUE)
> [1] TRUE
>> dbWriteTable(con, "tmp", cbind(NA, x), row.name=FALSE, append=TRUE)
> [1] TRUE
>> dbReadTable(con, "tmp")
> id a b
> 1 1 a d
> 2 2 b e
> 3 3 c f
> 4 4 a d
> 5 5 b e
> 6 6 c f
>
>
>
>
>
> Gregory. R. Warnes wrote:
>>
>> Hi All,
>>
>> I've finally gotten around to database access using R. I'm happily
>> extracting rows from a MySQL database using RMySQL, but am having
>> problems appending rows to an existing table.
>>
>> What I *want* to do is to append rows to the table, allowing the
>> database to automatically generate primary key values. I've only
>> managed to add rows by using
>>
>> dbWriteTable( con, "past_purchases", newRecords, overwrite=FALSE,
>> append=TRUE, ...)
>>
>> And this only appears to properly append rows (as opposed to
>> overwriting them) IFF
>> 1) the row names for newRecords are new unique primary key values,
>> 2) the argument row.names is TRUE.
>>
>> If row.names is FALSE, the records will not be appended, even if
>> newRecords contains a column (named 'id') of unique values that
>> correspond to the primary key (named 'id').
>>
>> It appears that in this case, the row names on the data frame are
>> still being used for the primary key, and since overwrite is FALSE,
>> the new records are being silently dropped.
>>
>>
>> I did manage to get things working by doing the following:
>>
>> ## get the last used id value (primary key)
>> maxId <- dbGetQuery(con, "SELECT MAX(id) FROM past_purchases")[1,1]
>> maxId
>> if(is.na(maxId)) maxId <- -1
>>
>> ## add the new unique primary keys as row names
>> rownames(fulldata) <- maxId + 1:nrow(fulldata)
>>
>> ## now write out the data
>> dbWriteTable(con, "past_purchases", value=fulldata, overwrite=FALSE,
>> append=TRUE, row.names=TRUE)
>>
>>
>> Is there a better way to accomplish this task? (Session info is
>> below)
>>
>> Thanks!,
>>
>> -Greg
>>
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/SQL-INSERT-using-RMySQL-tp16640280p16644954.html
> Sent from the R help mailing list archive at Nabble.com.
>