[R-sig-DB] NULL data not mapped to NA with RODBC on 64-bit Mac OS X

Harlan Harris h@r|@n @end|ng |rom h@rr|@@n@me
Thu Mar 31 15:35:40 CEST 2011


Prof. Ripley? Anyone? Should I cross-post this to R-Help? If someone would
be willing to fix this for pay, please email me a proposal and I'll see if
my employer would be willing to sponsor a fix.

 -Harlan

On Tue, Mar 29, 2011 at 9:43 AM, Harlan Harris <harlan using harris.name> wrote:

> Following up on this... Prof. Ripley? Is this something you can address?
>
>  -Harlan
>
>
> On Thu, Mar 24, 2011 at 9:54 AM, Harlan Harris <harlan using harris.name> wrote:
>
>> Thanks for the reply, Marc!
>>
>> Ah, it does appear I'm booting in 32-bit mode. From the terminal:
>>
>> > uname -a
>> Darwin Harlan-Harris-MacBook-Pro.local 10.6.0 Darwin Kernel Version
>> 10.6.0: Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386 i386
>>
>> I can usually run stuff in 32-bit mode, but sometimes I would like the
>> ability to use the extra RAM. Also, certain interfaces to R (such as the new
>> RStudio) will run R in 64-bit mode, so for now, I can't use RStudio when I'm
>> pulling data via ODBC that might have NULLs. I think Rscript also wants to
>> run in 64-bit mode.
>>
>>  -Harlan
>>
>>
>> On Thu, Mar 24, 2011 at 9:09 AM, Marc Schwartz <marc_schwartz using me.com>wrote:
>>
>>> Hi Harlan,
>>>
>>> I suspect that we will need to wait for Prof. Ripley to be able to look
>>> into this further. There may be subtle issues at play here and I would defer
>>> to his more intimate knowledge of the ODBC tool chain.
>>>
>>> I do know that the Actual driver is both 32 and 64 bit in a single
>>> installation. However, I run R in 32 bit mode (don't need the extra address
>>> space, even though I now have 8Gb on my MacBook Pro), so have not
>>> encountered the 64 bit issues that you report here.
>>>
>>> I have the same header file. Note that SQLLEN is a signed declaration
>>> below. It is defined as a 'long' generally and as a INT64 for 64 bit
>>> Windows. It would be 'unsigned long' or UINT64, respectively for an unsigned
>>> value.
>>>
>>> One question for you however, which is do you know if you are booting the
>>> 32 bit or 64 bit OSX kernel on your Mac? Macs prior to mid-2010 (like mine)
>>> running Snow Leopard typically default to the 32 bit kernel, though can be
>>> 64 bit 'capable'. It depends upon various factors including the presence of
>>> a 64 bit EFI. Newer Macs now boot into the 64 bit kernel by default. I have
>>> seen some problems reported on older Macs where the default kernel was
>>> changed from 32 bit to 64 bit, resulting in subtle and not so subtle
>>> problems with some drivers. Booting to the 32 bit kernel still allows you to
>>> run 64 bit apps under Snow Leopard, just that the kernel and extensions are
>>> 32 bit.
>>>
>>> Regards,
>>>
>>> Marc
>>>
>>>
>>> On Mar 23, 2011, at 2:29 PM, Harlan Harris wrote:
>>>
>>> > Any thoughts? Is there someone else or somewhere else I might ask?
>>> >
>>> > -Harlan
>>> >
>>> > On Tue, Mar 22, 2011 at 1:31 PM, Harlan Harris <harlan using harris.name>
>>> wrote:
>>> >
>>> >> Hello,
>>> >>
>>> >> I seem to have hit a bug in RODBC on 64-bit versions of R on Mac OS X.
>>> I'm
>>> >> running R 2.12.1, and using R to talk to Oracle and other databases
>>> using
>>> >> RODBC 1.3-2. In 32-bit mode (R32), everything works. (The ODBC drivers
>>> are
>>> >> the ones from Actual.) In 64-bit mode (R), it seems as if NULL data is
>>> not
>>> >> being treated as missing. For example, a NULL value in an integer
>>> column is
>>> >> being returned as a 0 instead, which is wrong/bad.
>>> >>
>>> >> Diving into the code and Google a bit, it seems like an issue that's
>>> been
>>> >> seen before in other contexts. Here's the relevant C code, from
>>> >> RODBCFetchRows:
>>> >>
>>> >>        for(row = thisHandle->rowsUsed;
>>> >>        row < thisHandle->rowsFetched && j <= maximum;
>>> >>        j++, row++)
>>> >>        {
>>> >>        thisHandle->rowsUsed++;
>>> >>        if(j > blksize) {
>>> >>            blksize *= 2;
>>> >>            for (i = 0; i < nc; i++)
>>> >>            SET_VECTOR_ELT(data, i,
>>> >>                       lengthgets(VECTOR_ELT(data, i), blksize));
>>> >>        }
>>> >>        for (i = 0; i < nc; i++) {
>>> >>            SQLLEN len = thisHandle->ColData[i].IndPtr[row];
>>> >>            switch(thisHandle->ColData[i].DataType) {
>>> >>            case SQL_DOUBLE:
>>> >>            REAL(VECTOR_ELT(data, i))[j-1] =
>>> >>                len == SQL_NULL_DATA ? NA_REAL :
>>> >>                thisHandle->ColData[i].RData[row];
>>> >>            break;
>>> >>
>>> >> etc...
>>> >>
>>> >> Googling for SQL_NULL_DATA and related terms finds this page, about
>>> the
>>> >> same issue in a Python driver:
>>> >>
>>> >> http://code.google.com/p/pyodbc/issues/detail?id=51
>>> >>
>>> >> It appears as if SQLLEN might be unsigned in 64-bit builds, while
>>> >> SQL_NULL_DATA is -1. Casting len to be an int rather than an unsigned
>>> int
>>> >> seems to solve the problem for them.
>>> >>
>>> >> On my machine, /usr/include/sqltypes.h starts like this:
>>> >>
>>> >> /*
>>> >> *  sqltypes.h
>>> >> *
>>> >> *  $Id: sqltypes.h,v 1.23 2007/10/07 13:27:13 source Exp $
>>> >> *
>>> >> *  ODBC typedefs
>>> >> *
>>> >> *  The iODBC driver manager.
>>> >>
>>> >> and defines SQLLEN as follows:
>>> >>
>>> >> #ifdef _WIN64
>>> >> typedef INT64                   SQLLEN;
>>> >> typedef UINT64                  SQLULEN;
>>> >> typedef UINT64                  SQLSETPOSIROW;
>>> >> #elif defined(STRICT_ODBC_TYPES)
>>> >> typedef long                    SQLLEN;
>>> >> typedef unsigned long           SQLULEN;
>>> >> typedef unsigned short          SQLSETPOSIROW;
>>> >> #else
>>> >> #define SQLLEN                  long
>>> >> #define SQLULEN                 unsigned long
>>> >> #define SQLSETPOSIROW           unsigned short
>>> >> #endif
>>> >>
>>> >> Seems to me like SQLLEN should be signed, but apparently not?
>>> >>
>>> >> Can anyone help? Thank you!
>>> >>
>>> >> -Harlan
>>>
>>>
>>>
>>
>

	[[alternative HTML version deleted]]




More information about the R-sig-DB mailing list