[R] RODBC and NULL values
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Oct 17 12:18:00 CEST 2006
What sqltype(s) are your variables?
For numeric types, RODBC merely maps values the ODBC driver says are NULL
to NA. Since you appear not to have character data,
nullstring: character string to be used when reading 'SQL_NULL_DATA'
character items from the database.
na.strings: character string(s) to be mapped to 'NA' when reading
character data.
are not relevant to you.
At least on Windows and Linux the PostgreSQL 8.1 ODBC driver works
correctly, and NULLs in numeric columns are mapped to NAs in R. (There is
an example in my test suite.)
On Tue, 17 Oct 2006, Mark Wardle wrote:
> Dear All,
>
> Writing sooner than I thought I'd need to.
>
> I'm using R 2.4 on Mac OS X, with RODBC, PostgreSQL 8.1 and Actual's
> ODBC driver. I have all my data in Filemaker 8.5, but it is
> automatically exported into PostgreSQL for analysis as Filemaker's ODBC
> and JDBC access is awful, slow and has a tendency to crash.
>
> I have disability data where for each patient there is a survival time
> in years from disease onset to a particular disease stage, namely
> unilateral support, bilateral support, wheelchair use, and death. Valid
> values may include NULL (patient hasn't reached that stage), 0 (for
> example, patient needed support immediately at disease onset), and any
> positive integer.
>
> When I query the database manually using psql, it is clear there are
> NULL values.
> 3 | 3 | 18 | | 27 | 1
> | | | | 13 | 1
> 1 | 5 | | | 10 | 0
> 10 | 13 | 13 | | 22 | 0
No, it is not clear. It is clear that there are values which are printed
as blank or empty strings.
> However, these are all converted to zeros when I use RODBC's sqlQuery(),
> making interpretation impossible. I have tried using the nullstring and
> na.strings options, but these don't seem to have any effect. I have
> tried various combinations of NULL, NA and "". Forgive my awkward SQL.
>
>> channel = odbcConnect("ataxia", uid="mark")
>> disease = sqlQuery(channel, "select calc_survival_unilateral_support
> as unlateral, calc_survival_bilateral_support as bilateral,
> calc_survival_wheelchair as wheelchair,calc_survival_death as death,
> calc_follow_up as followup, has_family_history_ataxia as familial from
> clinical, patient where clinical.patient_fk = patient_id and excluded=0
> and calc_walking_disability_valid=1")
>> disease # and show results
>
> 127 3 3 18 0 27 1
> 128 0 0 0 0 13 1
> 129 1 5 0 0 10 0
> 130 10 13 13 0 22 0
>
> It doesn't seem to be the old repeating rows NULL bug talked about <a
> href="http://tolstoy.newcastle.edu.au/R/help/04/07/0803.html">here</a>.
That was about R 1.9.1, about a problem solved long before then. Let's
not drag up ancient history ....
> Is this because my ODBC driver is not returning the correct values for
> RODBC to parse? Is there anyway of debugging this (the intricacies of
> ODBC are beyond my skill) and is my only alternative to store a
> non-valid number in the database (999?) and use my query or R to remove
> those datapoints afterwards?
Find out what the types involved are. Perhaps try as.is=FALSE?
> Looking in the archives, there are lots of people asking about how to
> convert NAs to numeric, but I want the NAs passed through unaltered!
Since the mapping of NULLs to NAs works in other examples, I find it hard
to see how this can be an RODBC issue.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list