[R-sig-DB] NULL data not mapped to NA with RODBC on 64-bit Mac OS X

Harlan Harris h@r|@n @end|ng |rom h@rr|@@n@me
Tue Mar 29 15:43:31 CEST 2011


Following up on this... Prof. Ripley? Is this something you can address?

 -Harlan

On Thu, Mar 24, 2011 at 9:54 AM, Harlan Harris <harlan using harris.name> wrote:

> Thanks for the reply, Marc!
>
> Ah, it does appear I'm booting in 32-bit mode. From the terminal:
>
> > uname -a
> Darwin Harlan-Harris-MacBook-Pro.local 10.6.0 Darwin Kernel Version 10.6.0:
> Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386 i386
>
> I can usually run stuff in 32-bit mode, but sometimes I would like the
> ability to use the extra RAM. Also, certain interfaces to R (such as the new
> RStudio) will run R in 64-bit mode, so for now, I can't use RStudio when I'm
> pulling data via ODBC that might have NULLs. I think Rscript also wants to
> run in 64-bit mode.
>
>  -Harlan
>
>
> On Thu, Mar 24, 2011 at 9:09 AM, Marc Schwartz <marc_schwartz using me.com>wrote:
>
>> Hi Harlan,
>>
>> I suspect that we will need to wait for Prof. Ripley to be able to look
>> into this further. There may be subtle issues at play here and I would defer
>> to his more intimate knowledge of the ODBC tool chain.
>>
>> I do know that the Actual driver is both 32 and 64 bit in a single
>> installation. However, I run R in 32 bit mode (don't need the extra address
>> space, even though I now have 8Gb on my MacBook Pro), so have not
>> encountered the 64 bit issues that you report here.
>>
>> I have the same header file. Note that SQLLEN is a signed declaration
>> below. It is defined as a 'long' generally and as a INT64 for 64 bit
>> Windows. It would be 'unsigned long' or UINT64, respectively for an unsigned
>> value.
>>
>> One question for you however, which is do you know if you are booting the
>> 32 bit or 64 bit OSX kernel on your Mac? Macs prior to mid-2010 (like mine)
>> running Snow Leopard typically default to the 32 bit kernel, though can be
>> 64 bit 'capable'. It depends upon various factors including the presence of
>> a 64 bit EFI. Newer Macs now boot into the 64 bit kernel by default. I have
>> seen some problems reported on older Macs where the default kernel was
>> changed from 32 bit to 64 bit, resulting in subtle and not so subtle
>> problems with some drivers. Booting to the 32 bit kernel still allows you to
>> run 64 bit apps under Snow Leopard, just that the kernel and extensions are
>> 32 bit.
>>
>> Regards,
>>
>> Marc
>>
>>
>> On Mar 23, 2011, at 2:29 PM, Harlan Harris wrote:
>>
>> > Any thoughts? Is there someone else or somewhere else I might ask?
>> >
>> > -Harlan
>> >
>> > On Tue, Mar 22, 2011 at 1:31 PM, Harlan Harris <harlan using harris.name>
>> wrote:
>> >
>> >> Hello,
>> >>
>> >> I seem to have hit a bug in RODBC on 64-bit versions of R on Mac OS X.
>> I'm
>> >> running R 2.12.1, and using R to talk to Oracle and other databases
>> using
>> >> RODBC 1.3-2. In 32-bit mode (R32), everything works. (The ODBC drivers
>> are
>> >> the ones from Actual.) In 64-bit mode (R), it seems as if NULL data is
>> not
>> >> being treated as missing. For example, a NULL value in an integer
>> column is
>> >> being returned as a 0 instead, which is wrong/bad.
>> >>
>> >> Diving into the code and Google a bit, it seems like an issue that's
>> been
>> >> seen before in other contexts. Here's the relevant C code, from
>> >> RODBCFetchRows:
>> >>
>> >>        for(row = thisHandle->rowsUsed;
>> >>        row < thisHandle->rowsFetched && j <= maximum;
>> >>        j++, row++)
>> >>        {
>> >>        thisHandle->rowsUsed++;
>> >>        if(j > blksize) {
>> >>            blksize *= 2;
>> >>            for (i = 0; i < nc; i++)
>> >>            SET_VECTOR_ELT(data, i,
>> >>                       lengthgets(VECTOR_ELT(data, i), blksize));
>> >>        }
>> >>        for (i = 0; i < nc; i++) {
>> >>            SQLLEN len = thisHandle->ColData[i].IndPtr[row];
>> >>            switch(thisHandle->ColData[i].DataType) {
>> >>            case SQL_DOUBLE:
>> >>            REAL(VECTOR_ELT(data, i))[j-1] =
>> >>                len == SQL_NULL_DATA ? NA_REAL :
>> >>                thisHandle->ColData[i].RData[row];
>> >>            break;
>> >>
>> >> etc...
>> >>
>> >> Googling for SQL_NULL_DATA and related terms finds this page, about the
>> >> same issue in a Python driver:
>> >>
>> >> http://code.google.com/p/pyodbc/issues/detail?id=51
>> >>
>> >> It appears as if SQLLEN might be unsigned in 64-bit builds, while
>> >> SQL_NULL_DATA is -1. Casting len to be an int rather than an unsigned
>> int
>> >> seems to solve the problem for them.
>> >>
>> >> On my machine, /usr/include/sqltypes.h starts like this:
>> >>
>> >> /*
>> >> *  sqltypes.h
>> >> *
>> >> *  $Id: sqltypes.h,v 1.23 2007/10/07 13:27:13 source Exp $
>> >> *
>> >> *  ODBC typedefs
>> >> *
>> >> *  The iODBC driver manager.
>> >>
>> >> and defines SQLLEN as follows:
>> >>
>> >> #ifdef _WIN64
>> >> typedef INT64                   SQLLEN;
>> >> typedef UINT64                  SQLULEN;
>> >> typedef UINT64                  SQLSETPOSIROW;
>> >> #elif defined(STRICT_ODBC_TYPES)
>> >> typedef long                    SQLLEN;
>> >> typedef unsigned long           SQLULEN;
>> >> typedef unsigned short          SQLSETPOSIROW;
>> >> #else
>> >> #define SQLLEN                  long
>> >> #define SQLULEN                 unsigned long
>> >> #define SQLSETPOSIROW           unsigned short
>> >> #endif
>> >>
>> >> Seems to me like SQLLEN should be signed, but apparently not?
>> >>
>> >> Can anyone help? Thank you!
>> >>
>> >> -Harlan
>>
>>
>>
>

	[[alternative HTML version deleted]]




More information about the R-sig-DB mailing list