[Rd] table(exclude = NULL) always includes NA
Suharto Anggono
suharto_anggono at yahoo.com
Sun Aug 7 17:32:19 CEST 2016
This is an example from https://stat.ethz.ch/pipermail/r-help/2007-May/132573.html .
With R 2.7.2:
> a <- c(1, 1, 2, 2, NA, 3); b <- c(2, 1, 1, 1, 1, 1)
> table(a, b, exclude = NULL)
      b
a      1 2
  1    1 1
  2    2 0
  3    1 0
  <NA> 1 0
With R 3.3.1:
> a <- c(1, 1, 2, 2, NA, 3); b <- c(2, 1, 1, 1, 1, 1)
> table(a, b, exclude = NULL)
      b
a      1 2 <NA>
  1    1 1    0
  2    2 0    0
  3    1 0    0
  <NA> 1 0    0
> table(a, b, useNA = "ifany")
      b
a      1 2
  1    1 1
  2    2 0
  3    1 0
  <NA> 1 0
> table(a, b, exclude = NULL, useNA = "ifany")
      b
a      1 2 <NA>
  1    1 1    0
  2    2 0    0
  3    1 0    0
  <NA> 1 0    0
In this example, the result of 'table' with exclude = NULL in R 3.3.1 includes NA even when no NA is present. This differs from R 2.7.2, where the behavior came from factor(exclude = NULL), which includes NA as a level only if NA is present.
From R 3.3.1 help on 'table', in the "Details" section:
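The behavior of factor(exclude = NULL) described above can be checked directly. This is only a minimal sketch; the names 'with_na' and 'without_na' are mine and are not part of the original example.

```r
# factor(exclude = NULL) keeps NA as a level only when NA occurs in the data.
with_na    <- factor(c(1, 1, 2, 2, NA, 3), exclude = NULL)
without_na <- factor(c(2, 1, 1, 1, 1, 1), exclude = NULL)

levels(with_na)             # "1" "2" "3" NA
levels(without_na)          # "1" "2"
anyNA(levels(with_na))      # TRUE
anyNA(levels(without_na))   # FALSE
```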
'useNA' controls if the table includes counts of 'NA' values: the allowed values correspond to never, only if the count is positive and even for zero counts. This is overridden by specifying 'exclude = NULL'.
Specifying 'exclude = NULL' overrides 'useNA' to what value? The documentation doesn't say. Looking at the code of function 'table', the value is "always".
In R 3.3.1, the R 2.7.2 result for this example can be obtained with useNA = "ifany" and 'exclude' left unspecified.
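To make the difference concrete without relying on exclude = NULL, the two 'useNA' settings can be compared by table dimensions. This is a sketch using the same 'a' and 'b' as above; 't_ifany' and 't_always' are names I introduce for illustration. The second call spells out the "always" value that, from my reading of the code, exclude = NULL implies in R 3.3.1.

```r
a <- c(1, 1, 2, 2, NA, 3); b <- c(2, 1, 1, 1, 1, 1)

t_ifany  <- table(a, b, useNA = "ifany")   # NA shown only where it occurs
t_always <- table(a, b, useNA = "always")  # NA shown for every factor

dim(t_ifany)    # 4 2: NA row for 'a', no NA column for 'b'
dim(t_always)   # 4 3: 'b' gains an all-zero NA column
```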
The result of 'summary' of a logical vector is affected. As mentioned in http://stackoverflow.com/questions/26775501/r-dropping-nas-in-logical-column-levels , in the code of function 'summary.default', for logical, table(object, exclude = NULL) is used.
With R 2.7.2:
> log <- c(NA, logical(4), NA, !logical(2), NA)
> summary(log)
   Mode   FALSE    TRUE    NA's
logical       4       2       3
> summary(log[!is.na(log)])
   Mode   FALSE    TRUE
logical       4       2
> summary(TRUE)
   Mode    TRUE
logical       1
With R 3.3.1:
> log <- c(NA, logical(4), NA, !logical(2), NA)
> summary(log)
   Mode   FALSE    TRUE    NA's
logical       4       2       3
> summary(log[!is.na(log)])
   Mode   FALSE    TRUE    NA's
logical       4       2       0
> summary(TRUE)
   Mode    TRUE    NA's
logical       1       0
In R 3.3.1, "NA's" is always in the result of 'summary' of a logical vector. This is unlike 'summary' of a numeric vector.
On the other hand, in R 3.3.1, FALSE is not in the result of 'summary' of a logical vector that doesn't contain FALSE.
I would prefer 'summary' of a logical vector to behave as in R 2.7.2, or, alternatively, to always include all possible values: FALSE, TRUE, NA.
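Both alternatives can be illustrated with 'table' alone. This is a rough sketch of the counts I have in mind, not a proposed patch to 'summary.default'; 'x' and 'all_values' are names made up for the illustration.

```r
x <- c(TRUE, TRUE)  # logical vector with no FALSE and no NA

# Like R 2.7.2: report NA only when it is present.
table(x, useNA = "ifany")

# Alternative: always report FALSE, TRUE and NA, even with zero counts.
all_values <- table(factor(x, levels = c(FALSE, TRUE)), useNA = "always")
all_values
```

With the first form only TRUE is counted for this 'x'; with the second form FALSE and NA appear with count 0.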