[R] RODBC and NULL values

Mark Wardle mark at wardle.org
Tue Oct 17 12:49:06 CEST 2006


Prof Brian Ripley wrote:
> What sqltype(s) are your variables?
> 

The variables are all numeric.

> For numeric types, RODBC merely maps values the ODBC driver says are
> NULL to NA.  Since you appear not to have character data,
> 
> nullstring: character string to be used when reading 'SQL_NULL_DATA'
>           character items from the database.
> 
> na.strings: character string(s) to be mapped to 'NA' when reading
>           character data.
> 
> are not relevant to you.

I thought that, but was grasping at straws because at that point I
didn't know whether it was problem with the ODBC driver misinforming
RODBC about the correct character types.


> 
> At least on Windows and Linux the PostgreSQL 8.1 ODBC driver works
> correctly, and NULLs in numeric columns are mapped to NAs in R.  (There
> is an example in my test suite.)

I'm using Actual's ODBC driver. In my previous email, I did a test with
another ODBC client (Microsoft Excel/Query) and found it too was
misinterpreting NULL values as zero, concluding it was an issue with the
ODBC driver itself. However, I was wrong - using the iodbctest program,
the ODBC driver *is* successfully returning NULLs. It is only Microsoft
Excel/Query and R that I am having the problem with these empty
spaces/NULL characters being converted to zeros.

>> ...
>> When I query the database manually using psql, it is clear there are
>> NULL values.
>>        3 |         3 |         18 |       |       27 |        1
>>          |           |            |       |       13 |        1
>>        1 |         5 |            |       |       10 |        0
>>       10 |        13 |         13 |       |       22 |        0
> 
> No, it is not clear.  It is clear that there are values which are
> printed as blank or empty strings.
> 

I *think* postgresql is regarding them as NULL values. I don't know
whether this proves it? [The first two must be functionally equivalent)

ataxia=#select count(calc_survival_bilateral_support) from clinical;
 count
-------
    53
(1 row)

ataxia=#select count(calc_survival_bilateral_support) from clinical
where calc_survival_bilateral_support is NOT NULL;
 count
-------
    53
(1 row)


ataxia=# select count(*) from clinical;
 count
-------
   140
(1 row)


> Find out what the types involved are.  Perhaps try as.is=FALSE?
> 
Have done, and I'm afraid it doesn't change anything.

> Since the mapping of NULLs to NAs works in other examples, I find it
> hard to see how this can be an RODBC issue.
> 
Perhaps it is a peculiarity in my set-up, or I'm missing something
obvious and making some assumption somewhere. I will retrace my steps!
Perhaps I should use a different approach, but I always have difficulty
giving up on a problem unsolved!

-- 
Dr. Mark Wardle
Clinical research fellow and Specialist Registrar in Neurology,
C2-B2 link, Cardiff University, Heath Park, CARDIFF, CF14 4XN. UK



More information about the R-help mailing list