[R-sig-DB] NULL data not mapped to NA with RODBC on 64-bit Mac OS X

Tomoaki NISHIYAMA tomo@k|n @end|ng |rom kenroku@k@n@z@w@-u@@c@jp
Tue Apr 19 03:20:30 CEST 2011


Hi Harlan,

I don't think this is a right way to fix.
Because
(int64_t) 0xFFFFFFFF == 4294967295 != -1
the bug resides on the code setting
ColData[i].IndPtr[row].
Changing the way to interpret it is a wrong solution. 
Though, it might work as a workaround for the bug in other parts.

> The commented-out warning() in the code was used to confirm that without the casts,

You did see only the lower 32 bits?
-- 
Tomoaki NISHIYAMA

Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi, 
Kanazawa, 920-0934, Japan


On 2011/04/19, at 1:31, Harlan Harris wrote:

> OK, with the help of a rusty C coder at work, I've confirmed that the solution the Python crew figured out also seems to work for me. I have no idea if it breaks the 32-bit MacOS build, or any other build. Note that this seems to be a very subtle effect of the way the compiler and header files are interacting, as far as I can tell. The commented-out warning() in the code was used to confirm that without the casts, the len variable and SQL_NULL_DATA are, in fact, the same, and that despite their being the same, the equality test fails.
> 
> A patch is attached. All of the len variables are cast to ints. (Note that casting to long, which should work, doesn't.) Would someone please apply to patch to the source tree and do some regression testing? 
> 
> -Harlan
> 
> 
> On Thu, Mar 31, 2011 at 9:35 AM, Harlan Harris <harlan using harris.name> wrote:
> Prof. Ripley? Anyone? Should I cross-post this to R-Help? If someone would be willing to fix this for pay, please email me a proposal and I'll see if my employer would be willing to sponsor a fix.
> 
>  -Harlan
> 
> 
> On Tue, Mar 29, 2011 at 9:43 AM, Harlan Harris <harlan using harris.name> wrote:
> Following up on this... Prof. Ripley? Is this something you can address? 
> 
>  -Harlan
> 
> 
> On Thu, Mar 24, 2011 at 9:54 AM, Harlan Harris <harlan using harris.name> wrote:
> Thanks for the reply, Marc!
> 
> Ah, it does appear I'm booting in 32-bit mode. From the terminal:
> 
> > uname -a
> Darwin Harlan-Harris-MacBook-Pro.local 10.6.0 Darwin Kernel Version 10.6.0: Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386 i386
> 
> I can usually run stuff in 32-bit mode, but sometimes I would like the ability to use the extra RAM. Also, certain interfaces to R (such as the new RStudio) will run R in 64-bit mode, so for now, I can't use RStudio when I'm pulling data via ODBC that might have NULLs. I think Rscript also wants to run in 64-bit mode.
> 
>  -Harlan
> 
> 
> On Thu, Mar 24, 2011 at 9:09 AM, Marc Schwartz <marc_schwartz using me.com> wrote:
> Hi Harlan,
> 
> I suspect that we will need to wait for Prof. Ripley to be able to look into this further. There may be subtle issues at play here and I would defer to his more intimate knowledge of the ODBC tool chain.
> 
> I do know that the Actual driver is both 32 and 64 bit in a single installation. However, I run R in 32 bit mode (don't need the extra address space, even though I now have 8Gb on my MacBook Pro), so have not encountered the 64 bit issues that you report here.
> 
> I have the same header file. Note that SQLLEN is a signed declaration below. It is defined as a 'long' generally and as a INT64 for 64 bit Windows. It would be 'unsigned long' or UINT64, respectively for an unsigned value.
> 
> One question for you however, which is do you know if you are booting the 32 bit or 64 bit OSX kernel on your Mac? Macs prior to mid-2010 (like mine) running Snow Leopard typically default to the 32 bit kernel, though can be 64 bit 'capable'. It depends upon various factors including the presence of a 64 bit EFI. Newer Macs now boot into the 64 bit kernel by default. I have seen some problems reported on older Macs where the default kernel was changed from 32 bit to 64 bit, resulting in subtle and not so subtle problems with some drivers. Booting to the 32 bit kernel still allows you to run 64 bit apps under Snow Leopard, just that the kernel and extensions are 32 bit.
> 
> Regards,
> 
> Marc
> 
> 
> On Mar 23, 2011, at 2:29 PM, Harlan Harris wrote:
> 
> > Any thoughts? Is there someone else or somewhere else I might ask?
> >
> > -Harlan
> >
> > On Tue, Mar 22, 2011 at 1:31 PM, Harlan Harris <harlan using harris.name> wrote:
> >
> >> Hello,
> >>
> >> I seem to have hit a bug in RODBC on 64-bit versions of R on Mac OS X. I'm
> >> running R 2.12.1, and using R to talk to Oracle and other databases using
> >> RODBC 1.3-2. In 32-bit mode (R32), everything works. (The ODBC drivers are
> >> the ones from Actual.) In 64-bit mode (R), it seems as if NULL data is not
> >> being treated as missing. For example, a NULL value in an integer column is
> >> being returned as a 0 instead, which is wrong/bad.
> >>
> >> Diving into the code and Google a bit, it seems like an issue that's been
> >> seen before in other contexts. Here's the relevant C code, from
> >> RODBCFetchRows:
> >>
> >>        for(row = thisHandle->rowsUsed;
> >>        row < thisHandle->rowsFetched && j <= maximum;
> >>        j++, row++)
> >>        {
> >>        thisHandle->rowsUsed++;
> >>        if(j > blksize) {
> >>            blksize *= 2;
> >>            for (i = 0; i < nc; i++)
> >>            SET_VECTOR_ELT(data, i,
> >>                       lengthgets(VECTOR_ELT(data, i), blksize));
> >>        }
> >>        for (i = 0; i < nc; i++) {
> >>            SQLLEN len = thisHandle->ColData[i].IndPtr[row];
> >>            switch(thisHandle->ColData[i].DataType) {
> >>            case SQL_DOUBLE:
> >>            REAL(VECTOR_ELT(data, i))[j-1] =
> >>                len == SQL_NULL_DATA ? NA_REAL :
> >>                thisHandle->ColData[i].RData[row];
> >>            break;
> >>
> >> etc...
> >>
> >> Googling for SQL_NULL_DATA and related terms finds this page, about the
> >> same issue in a Python driver:
> >>
> >> http://code.google.com/p/pyodbc/issues/detail?id=51
> >>
> >> It appears as if SQLLEN might be unsigned in 64-bit builds, while
> >> SQL_NULL_DATA is -1. Casting len to be an int rather than an unsigned int
> >> seems to solve the problem for them.
> >>
> >> On my machine, /usr/include/sqltypes.h starts like this:
> >>
> >> /*
> >> *  sqltypes.h
> >> *
> >> *  $Id: sqltypes.h,v 1.23 2007/10/07 13:27:13 source Exp $
> >> *
> >> *  ODBC typedefs
> >> *
> >> *  The iODBC driver manager.
> >>
> >> and defines SQLLEN as follows:
> >>
> >> #ifdef _WIN64
> >> typedef INT64                   SQLLEN;
> >> typedef UINT64                  SQLULEN;
> >> typedef UINT64                  SQLSETPOSIROW;
> >> #elif defined(STRICT_ODBC_TYPES)
> >> typedef long                    SQLLEN;
> >> typedef unsigned long           SQLULEN;
> >> typedef unsigned short          SQLSETPOSIROW;
> >> #else
> >> #define SQLLEN                  long
> >> #define SQLULEN                 unsigned long
> >> #define SQLSETPOSIROW           unsigned short
> >> #endif
> >>
> >> Seems to me like SQLLEN should be signed, but apparently not?
> >>
> >> Can anyone help? Thank you!
> >>
> >> -Harlan
> 
> 
> 
> 
> 
> 
> <RODBC.c.diff>_______________________________________________
> R-sig-DB mailing list -- R Special Interest Group
> R-sig-DB using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-db


	[[alternative HTML version deleted]]




More information about the R-sig-DB mailing list