[R] Consistent test for NAs in a factor when exclude = NULL?

andrewH ahoerner at rprogress.org
Thu Oct 27 02:32:27 CEST 2011


Dear folks,

Is there a function to correctly find (and count) the NAs in a factor when
exclude=NULL, regardless of whether their origin is in the original data or
by subsequent assignment?

In example number 1 below, where NAs are assigned by is.na()<-, testing the
factor with is.na() finds the correct number of NAs.  In example number 2,
where the NAs are from the data, neither is.na(), ==NA, nor =="NA" correctly
identifies the NAs.  In example number 3, which mixes NAs from assignment
with NAs from data, is.na() does not even find the NAs created by assignment,
as it did in example 1.

I'm running R 2.13.2 on Windows XP with Service Pack 3.

Any assistance would be greatly appreciated.

Appreciatively, andrewH


Example #1

> # Origin: is.na()<-  Exclude: NULL
> KK <- factor(c("A","A","B","B","C","C"), exclude=NULL)
> KK[KK=="C"]
[1] C C
Levels: A B C
> is.na(KK[KK=="C"]) <- TRUE
> KK
[1] A    A    B    B    <NA> <NA>
Levels: A B C
> levels(KK)
[1] "A" "B" "C"
> levels(KK)[KK]
[1] "A" "A" "B" "B" NA  NA 
> KK==NA
[1] NA NA NA NA NA NA
> sum(KK==NA)
[1] NA
> KK=="NA"
[1] FALSE FALSE FALSE FALSE    NA    NA
> sum(KK=="NA")
[1] NA
> is.na(KK)
[1] FALSE FALSE FALSE FALSE  TRUE  TRUE
> sum(is.na(KK))
[1] 2

Example #2

> # Origin: data Exclude: NULL
> GG <- factor(c("A","A","B","B", NA, NA), exclude=NULL)
> GG
[1] A    A    B    B    <NA> <NA>
Levels: A B <NA>
> levels(GG)
[1] "A" "B" NA 
> levels(GG)[GG]
[1] "A" "A" "B" "B" NA  NA 
> GG==NA
[1] NA NA NA NA NA NA
> sum(GG==NA)
[1] NA
> GG=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(GG=="NA")
[1] 0
> is.na(GG)
[1] FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(GG))
[1] 0

Example #3

> MM <- factor(c("A","A","B","B","C","C", NA), exclude=NULL)
> is.na(MM[MM=="C"]) <- TRUE
> MM
[1] A    A    B    B    <NA> <NA> <NA>
Levels: A B C <NA>
> levels(MM)
[1] "A" "B" "C" NA 
> levels(MM)[MM]
[1] "A" "A" "B" "B" NA  NA  NA 
> MM==NA
[1] NA NA NA NA NA NA NA
> sum(MM==NA)
[1] NA
> MM=="NA"
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(MM=="NA")
[1] 0
> is.na(MM)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
> sum(is.na(MM))
[1] 0
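
In all three examples, levels(x)[x] comes back NA at exactly the missing positions, whichever way the NA arose. A minimal sketch of a test built on that observation follows; is_na_factor is a hypothetical helper name, not a base R function, and the approach is only a guess at a workable check rather than an established answer.

```r
# Hypothetical helper (not part of base R): flags an element as missing when
# its integer code is NA or when its code points at an NA level.
is_na_factor <- function(f) {
  stopifnot(is.factor(f))
  # Look up each element's level label by its integer code; both kinds of
  # missing value yield NA in the resulting character vector.
  is.na(levels(f)[as.integer(f)])
}

# NAs created by assignment (as in Example #1)
KK <- factor(c("A", "A", "B", "B", "C", "C"), exclude = NULL)
is.na(KK[KK == "C"]) <- TRUE

# NAs coming from the data (as in Example #2)
GG <- factor(c("A", "A", "B", "B", NA, NA), exclude = NULL)

# Both origins mixed (as in Example #3)
MM <- factor(c("A", "A", "B", "B", "C", "C", NA), exclude = NULL)
is.na(MM[MM == "C"]) <- TRUE

counts <- c(KK = sum(is_na_factor(KK)),
            GG = sum(is_na_factor(GG)),
            MM = sum(is_na_factor(MM)))
print(counts)
```

The helper looks only at the level labels, so it does not depend on whether a given NA is stored as an NA code or as a pointer to an NA level.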
