[R] "na.strings" and the like; suspending interpretation of "NA"

Daniel Nordlund djnordlund at verizon.net
Tue Aug 4 08:45:24 CEST 2009


> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Jan Theodore Galkowski
> Sent: Monday, August 03, 2009 10:23 PM
> To: R Project
> Subject: [R] "na.strings" and the like; suspending interpretation of "NA"
> 
> Can someone point me to the proper place in the documentation or on the
> Wiki where I can learn how to get R to stop interpreting the string "NA"
> as something special?  I have a table in a database which contains
> (among other things) country codes and continent codes.  The standard
> set of two-letter codes includes "NA" to denote "North America". I
> learned of the "na.strings" parameter for RODBC's "sqlQuery", being able
> to shut down this interpretation when data is read in.
> 
> However, in the program which uses this data, I (must) have some other
> instance where the "NA" gets spontaneously interpreted as "not
> available", shows up in vectors and lists as "<NA>", and breaks
> function. I temporarily solved the problem by defining all instances of
> "NA" in the database as "NAC".  It still would be good to know a
> general solution.  I've seen something mentioned in conjunction with
> "options", but I'm not sure what that is about.
> 
> Thanks much,
> 
>   - Jan, 
>      Akamai Technologies,
>      Cambridge, MA
> 

Jan,

If you search the help for NA, i.e.

?NA

You will see:

Details
The NA of character type is distinct from the string "NA". Programmers who
need to specify an explicit string NA should use NA_character_ rather than
"NA", or set elements to NA using is.na<-. 

So one can do the following 

> s <- 'NA'
> s
[1] "NA"
> is.na(s)
[1] FALSE

> s2 <- LETTERS[1:6]
> s2[6] <- NA
> s2
[1] "A" "B" "C" "D" "E" NA 
> is.na(s2)
[1] FALSE FALSE FALSE FALSE FALSE  TRUE
> 

Notice that in s, the characters NA are surrounded by quotes, and is.na()
returns FALSE.  In the character vector s2, the missing value NA is printed
without quotes, and is.na() returns TRUE for s2[6].  So R itself does not
confuse the string "NA" with the character-type NA.  You will need to give
more detail about how you are using RODBC, how your original data are
structured, and where in your program values are getting converted to NA,
before anyone can give you much help.
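
As one illustration of where such a conversion can happen, here is a minimal
sketch, assuming the data pass through a text import such as read.csv() at
some point.  Whether that is what happens in your program is only a guess;
the table contents below are made up for the example.  The point is that the
import functions, not R's string handling, turn "NA" into a missing value,
because their na.strings argument defaults to "NA".

```r
# Hypothetical two-column table; "NA" is meant as the code for North America.
csv_text <- "country,continent\nUS,NA\nFR,EU\n"

# With the default na.strings = "NA", the continent code is read as missing.
d1 <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(is.na(d1$continent))

# With na.strings = "", only empty fields count as missing,
# so "NA" stays an ordinary string.
d2 <- read.csv(text = csv_text, stringsAsFactors = FALSE, na.strings = "")
print(is.na(d2$continent))
print(d2$continent)
```

If a step like this turns out to be the culprit, overriding na.strings there
(as you already did for sqlQuery) keeps the code as a real string.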

Dan

Daniel Nordlund
Bothell, WA USA



