[R] SQL INSERT using RMySQL

Gregory Warnes gregory.warnes at mac.com
Sat Apr 12 20:40:20 CEST 2008


Hi All,

I figured out my problem.  There was a combination of lack of  
understanding on my part, and a bit of missing functionality.  I made  
a small patch to the rmysqlWriteTable() function passes the field  
names to MySQL corresponding to the data columns passed in:

diff -ru RMySQL.orig/R/MySQLSupport.R RMySQL/R/MySQLSupport.R
--- RMySQL.orig/R/MySQLSupport.R	2007-05-31 22:36:02.000000000 -0400
+++ RMySQL/R/MySQLSupport.R	2008-04-11 17:50:29.000000000 -0400
@@ -616,7 +616,9 @@
     on.exit(unlink(fn), add = TRUE)
     sql4 <- paste("LOAD DATA LOCAL INFILE '", fn, "'",
                    " INTO TABLE ", name,
-                  " LINES TERMINATED BY '\n' ", sep="")
+                  " LINES TERMINATED BY '\n' ",
+                  " ( ", paste(names(field.types), collapse=", "),  
");",
+                 sep="")
     rs <- try(dbSendQuery(new.con, sql4))
     if(inherits(rs, ErrorClass)){
        warning("could not load data into table")

I also defined a useful function for describing the structure of an  
existing table:


setGeneric(
            "dbDescribeTable",
            function(conn, name, ...)
              standardGeneric("dbDescribeTable"),
            valueClass = "character"
            )


setMethod(
           "dbDescribeTable",
           signature(conn="MySQLConnection", name="character"),
           def = function(conn, name, ...){
             rs <- dbGetQuery(conn, paste("describe", name))
             fields <- rs$Type
             names(fields) <- rs$Field
             if(length(fields)==0)
               fields <- character()
             fields
           },
           valueClass = "character"
           )

And I now have working code:

 > ## Columns in the table
 > dbDescribeTable(con, "past_purchases")
                 id        customer_id           item_upc
"int(10) unsigned"          "int(11)"       "bigint(12)"
          suggested           quantity              total
       "tinyint(1)"          "int(11)"          "int(11)"
            on_sale       actual_price           featured
       "tinyint(1)"           "double"       "tinyint(1)"
               date
             "date"
 >
 > ## columns in my data (note the absence of the primary key 'id')
 > head(fulldata)
   customer_id    item_upc suggested quantity total on_sale
1           3 11111111632     FALSE        1     1   FALSE
2           3 11111111733     FALSE        1     1   FALSE
3           3 11111116095     FALSE        1     1   FALSE
4           3 11111117164     FALSE        1     1   FALSE
5           3 11111117210     FALSE        1     1   FALSE
6           3 11111119092     FALSE        1     1   FALSE
   actual_price featured       date
1        10.49    FALSE 2008-03-22
2         4.99    FALSE 2008-03-22
3         5.49    FALSE 2008-03-22
4         9.99    FALSE 2008-03-22
5         4.19    FALSE 2008-03-22
6         3.99    FALSE 2008-03-22
 >
 > dim(fulldata)
[1] 75  9
 >
 >
 > ## Size of the table before adding my data
 > dbGetQuery(con, "SELECT COUNT(ID) FROM past_purchases")[1,1]
[1] 675
 >
 > ## Insert the data
 > dbWriteTable(
+              con,
+              "past_purchases",
+              value=fulldata,
+              overwrite=FALSE,
+              append=TRUE,
+              row.names=FALSE #,
+              #field.types=field.types
+              )
[1] TRUE
 >
 > ## Size of the table after adding my data
 > dbGetQuery(con, "SELECT COUNT(ID) FROM past_purchases")[1,1]
[1] 750

-Greg



On Apr 11, 2008, at 10:57PM , Chris Stubben wrote:
>
> Greg,
>
> If you have a MySQL table with an auto_increment field, you could just
> insert a NULL value into that column and the database will  
> increment the key
> (it may not work in SQL STRICT mode, I'm not sure).  I don't think  
> there's
> any way to specify which columns you want to load data into using
> dbWriteTable yet, but that would be a nice feature since LOAD data now
> allows that (and SET syntax and other options).
>
> Try this code below - I used cbind(NA, x) to insert a null into the  
> first
> column.
>
> Chris
>
>> dbSendQuery(con, "create table tmp (id int not null auto_increment  
>> primary
>> key, a char(1), b char(1))")
> <MySQLResult:(369,1,67)>
>> x<-data.frame( a=letters[1:3], b=letters[4:6])
>> x
>   a b
> 1 a d
> 2 b e
> 3 c f
>> dbWriteTable(con, "tmp", cbind(NA, x), row.name=FALSE, append=TRUE)
> [1] TRUE
>> dbWriteTable(con, "tmp", cbind(NA, x), row.name=FALSE, append=TRUE)
> [1] TRUE
>> dbReadTable(con, "tmp")
>   id a b
> 1  1 a d
> 2  2 b e
> 3  3 c f
> 4  4 a d
> 5  5 b e
> 6  6 c f
>
>
>
>
>
> Gregory. R. Warnes wrote:
>>
>> Hi All,
>>
>> I've finally gotten around to database access using R.  I'm happily
>> extracting rows from a MySQL database using RMySQL, but am having
>> problems appending rows to an existing table.
>>
>> What I *want* to do is to append rows to the table, allowing the
>> database to automatically generate primary key values.  I've only
>> managed to add rows by using
>>
>> 	dbWriteTable( con, "past_purchases", newRecords, overwrite=FALSE,
>> append=TRUE, ...)
>>
>> And this only appears to properly append rows (as opposed to
>> overwriting them) IFF
>> 1) the row names for newRecords are new unique primary key values,
>> 2) the argument row.names is TRUE.
>>
>> If row.names is FALSE, the records will not be appended, even if
>> newRecords contains a column (named 'id') of unique values that
>> corresponding to the primary key (named 'id').
>>
>> It appears that in this case, the row names on the data frame are
>> still being used for the primary key, and since overwrite is FALSE,
>> the new records are being silently dropped.
>>
>>
>> I did manage to get things working by doing the following:
>>
>> ## get the last used id value (primary key)
>> maxId <- dbGetQuery(con, "SELECT MAX(id) FROM past_purchases")[1,1]
>> maxId
>> if(is.na(maxId)) maxId <- -1
>>
>> ## add the new unique primary keys as row names
>> rownames(fulldata) <- maxId + 1:nrow(fulldata)
>>
>> ## now write out the data
>> dbWriteTable(con, "past_purchases", value=fulldata, overwrite=FALSE,
>> append=TRUE, row.names=TRUE)
>>
>>
>> Is there a better way to accomplish this task?  (Session info is  
>> below)
>>
>> Thanks!,
>>
>> -Greg
>>
>>
>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/SQL-INSERT- 
> using-RMySQL-tp16640280p16644954.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list