[Rd] Problem with table
Terry Therneau
therneau at mayo.edu
Tue Mar 27 15:12:08 CEST 2012
On 03/27/2012 02:05 AM, Prof Brian Ripley wrote:
> On 19/03/2012 17:01, Terry Therneau wrote:
>> R version 2.14.0, started with --vanilla
>>
>> > table(c(1,2,3,4,NA), exclude=2, useNA='ifany')
>>    1    3    4 <NA>
>>    1    1    1    2
>>
>> This came from a local user who wanted to remove one particular response
>> from some tables, but also wants to have NA always reported for data
>> checking purposes.
>> I don't think the above is what anyone would want.
>
> You have not told us what you want!
Want: that the resulting table exclude values of "2" from the printout,
while still reporting NA. This is what the local user expected, the one
who came to me with their query.
There are lots of ways to get the program to do the right thing; the
simplest is
    table(c(1,2,3,4,NA), exclude=2)   # keeping the default for useNA
You show another below.
>
> Try
>
> > table(as.factor(c(1,2,3,4,NA)), exclude=2, useNA='ifany')
>
>    1    3    4 <NA>
>    1    1    1    1
>
> Note carefully how 'exclude' is defined:
>
> exclude: levels to remove from all factors in ‘...’. If set to ‘NULL’,
> it implies ‘useNA="always"’.
>
> As you did not specify a factor, 'exclude' was used in forming the
> 'levels'.
>
That is almost a "legal loophole" reading of the manual. I would never
have seen through to that level of subtlety. A primary reason is that a
simple test shows that exclude works on non-factors.
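
The simple test mentioned above might look like the following sketch, in which 'exclude' is given together with a plain numeric vector rather than a factor. The vector x and the name tab_plain are illustrative assumptions, not taken from the original report. The input deliberately contains no NA, so the result should not depend on how a given R version treats useNA.

```r
# A plain numeric vector, not a factor (illustrative data).
x <- c(1, 2, 3, 4)

# table() turns x into a factor internally; the value 2 is left out
# of the levels at that point, so it gets no cell in the result.
tab_plain <- table(x, exclude = 2)
print(tab_plain)

# The remaining categories and their counts.
print(names(tab_plain))
print(as.vector(tab_plain))
```

If this runs as expected, only the categories 1, 3 and 4 remain, each with a count of one.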
I'm not sure what the best course of action is. What I've reported is a
case where use of the options in a fairly obvious way gives an
unexpected answer. On the other hand, I have never before seen or
considered the case where someone wanted to exclude an actual data level
from table: I myself would always have removed a column from the
result. If fixing this causes other problems, then perhaps we just
give up on this rare case.
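
Removing the unwanted column from the result, as described above, could be sketched as follows. The names x, full, keep and trimmed are hypothetical. The sketch uses %in% rather than != because the name of the NA cell is itself NA, and a comparison with != would give NA at that position instead of TRUE.

```r
x <- c(1, 2, 3, 4, NA)

# Tabulate everything first, asking for NA to be reported when present.
full <- table(x, useNA = "ifany")

# Drop the unwanted category afterwards. %in% returns FALSE (not NA)
# for the NA name, so the NA cell is kept.
keep <- !(names(full) %in% "2")
trimmed <- full[keep]
print(trimmed)
```

Because the exclusion happens after tabulation, the NA count is not affected by the 'exclude' argument at all.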
As to our local choices, we figured out a way to make display of NA the
default without causing the above problem. As is often the case, a
fairly simple solution became obvious to us about 30 minutes after
submitting a question to the list.
Terry T.