[R-sig-DB] data type mapping for RMySQL

Paul Gilbert pgilbert902 at gmail.com
Fri Dec 20 16:28:12 CET 2013


Just for the record, this also happens with PostgreSQL, so the problem is 
more general. (I tried testing in SQLite, but my standalone client seems 
to be encrypting the db and RSQLite cannot use it.) I'm not sure there 
is a solution to this, as I do not immediately see how to represent a 
64-bit bigint exactly in R.
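(A note on why, not from the drivers themselves: R stores these columns as doubles, whose 53-bit significand cannot represent every integer above 2^53, so larger values are rounded to the nearest representable double. A minimal illustration in plain R:)

```r
## Doubles have a 53-bit significand, so not every integer above 2^53
## is representable; larger integers round to the nearest representable
## double.
print(2^53 == 2^53 + 1)            # TRUE: 2^53 + 1 rounds back to 2^53
print(2029716610205351937 > 2^53)  # TRUE: well beyond the exact range
print(2029716610205351937 ==
      2029716610205351936)         # TRUE: both literals parse to the same
                                   # double, hence the changed last digit
```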

Generic SQL example:

 >mysql test
 >psql  test
 >sqlite test

[ DROP TABLE tester; ]

CREATE TABLE  tester (
       id bigint NOT NULL,
       address bigint NOT NULL
     ) ;

INSERT INTO tester VALUES (1, 2029716610205351937);

SELECT * FROM tester ;

gives
   1 | 2029716610205351937
in mysql, psql, and sqlite

( exit; /q or .quit )
(later DROP TABLE tester;)


 >R

require("RMySQL")
hubdb <- dbConnect("MySQL", dbname="test" )
print(fetch(dbSendQuery(hubdb, "select * from tester;"), n=-1), digits=20)

RMySQL gives
   id             address
1  1 2029716610205351936


require("RPostgreSQL")
hubdb <- dbConnect("PostgreSQL", dbname="test" )
print(fetch(dbSendQuery(hubdb, "select * from tester;"), n=-1), digits=20)
RPostgreSQL gives
   id             address
1  1 2029716610205351936
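(One workaround, as a sketch rather than driver-specific advice: cast the bigint to text in the query itself, e.g. CAST(address AS CHAR) in MySQL or CAST(address AS TEXT) in PostgreSQL, so the driver hands R a character column and no digits are lost. The rounding, and the fix, can be reproduced without a database:)

```r
## What the drivers do, in effect: coerce the 20-digit value to double.
as_double <- as.numeric("2029716610205351937")
print(as_double == 2029716610205351936)  # TRUE: rounded to nearest double

## Casting in SQL instead, e.g.
##   SELECT id, CAST(address AS CHAR) AS address FROM tester;
## returns the column as character, which R stores exactly:
as_character <- "2029716610205351937"    # what such a cast would hand back
print(nchar(as_character))               # 19: all digits preserved
```

(The bit64 package also provides an integer64 class for values like this, though one would still need the driver to deliver the digits intact, e.g. as character, before converting.)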


Paul


On 13-12-20 08:07 AM, Tim Coote wrote:
> Hullo
>
> I’m not sure whether this is a bug, but I would expect some sort of warning where mappings may lose precision.
>
> I’ve got a mysql table with large numbers in it:
>
> CREATE TABLE if not exists `tester` (
>        `id` bigint(20) NOT NULL AUTO_INCREMENT,
>        `address` bigint(20) NOT NULL,
>        PRIMARY KEY (`id`),
>        UNIQUE KEY `idx_address` (`address`)
>      ) ENGINE=InnoDB AUTO_INCREMENT=78628 DEFAULT CHARSET=latin1 ;
>
> The key aspect of this is that the ‘address’ column needs to be used to distinguish between values. If I put one value into this table, and read it back with the mysql client:
>
> mysql> \. test.sql
> Query OK, 1 row affected, 1 warning (0.00 sec)
>
> Database changed
> Query OK, 0 rows affected (0.05 sec)
>
> Query OK, 1 row affected (0.01 sec)
>
> +---------------------+
> | address             |
> +---------------------+
> | 2029716610205351937 |
> +---------------------+
> 1 row in set (0.00 sec)
>
> But if I try to read the value using R:
>> library(RMySQL)
> Loading required package: DBI
>> options(digits=20)
>>
>> hubdb <- dbConnect("MySQL", username="root", host="localhost", password="dummy", dbname="dummy" )
>> q = sprintf("select * from tester;")
>>
>>
>> testvals <- fetch(dbSendQuery(hubdb, q), n=-1)
>>
>> print (testvals)
>                address
> 1 2029716610205351936
>>
>>
> Note that the address value is different
>
> The issue is that the precision of the SQL type (20 decimal digits) is larger than the precision of the R type it has been matched to. I don’t know whether this is a bug or not - it would have helped to have a warning about the possible precision mismatch, rather than having to spot the emerging bug ;-) - that aside, is there a simple way to map the bigint type to something else? E.g. for this situation, character would do. I know that I can cast the type in the SQL query, but I’d prefer to set the mapping at a more global scope, in the data definition, if I could.
>
> tia
>
> Tim
> _______________________________________________
> R-sig-DB mailing list -- R Special Interest Group
> R-sig-DB at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-db
>



