[Rd] xyTable(x,y) versus table(x,y) with NAs
Bill Dunlap
Tue Apr 25 17:39:27 CEST 2023
> x <- c(1, 1, 2, 2, 2, 3)
> y <- c(1, 2, 1, 3, NA, 3)
> str(xyTable(x,y))
List of 3
$ x : num [1:6] 1 1 2 2 NA 3
$ y : num [1:6] 1 2 1 3 NA 3
$ number: int [1:6] 1 1 1 NA NA 1
How many (2,3)s do we have? At least one, the fourth entry, but the fifth
entry, (2,NA), is possibly a (2,3), so we don't know and make the count NA.
I suspect this is not the intended logic, but a byproduct of finding value
changes in a sorted vector with the idiom x[-1] != x[-length(x)]. Also, the
following does follow that logic:
> x <- c(1, 1, 2, 2, 5, 6)
> y <- c(2, 2, 2, 4, NA, 3)
> str(xyTable(x,y))
List of 3
$ x : num [1:5] 1 2 2 5 6
$ y : num [1:5] 2 2 4 NA 3
$ number: int [1:5] 2 1 1 1 1
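To see where the NAs in the first example could come from, here is a minimal
sketch of that change-detection idiom applied by hand to the sorted pairs. It
assumes xyTable() works roughly the way I suspect above; the variable names
(xs, ys, first, starts, number) are mine and are only for illustration.

```r
# Data from the first example
x <- c(1, 1, 2, 2, 2, 3)
y <- c(1, 2, 1, 3, NA, 3)
n <- length(x)

# Sort the pairs; order() puts the NA last among the pairs with x == 2
o <- order(x, y)
xs <- x[o]
ys <- y[o]

# TRUE where a pair differs from its predecessor.
# FALSE | NA is NA (position 5), while TRUE | NA is TRUE (position 6).
first <- c(TRUE, (xs[-1L] != xs[-n]) | (ys[-1L] != ys[-n]))
print(first)

# Indexing with a logical NA gives NA, and diff() spreads it to both neighbours
starts <- (1L:n)[first]
number <- diff(c(starts, n + 1L))
print(number)
```

The single NA in `first` becomes an NA start position, which diff() turns
into two NA counts: one for (2,3) and one for the NA pair itself. In the second
example the NA pair differs from its neighbours in x, so TRUE | NA is TRUE and
no NA appears.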
table() does not use this logic; if it did, a single NA in a vector would make
all the counts NA. Should xyTable have a way to handle NAs the way table() does?
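As a rough sketch of what that could look like, the function below counts
(x, y) pairs while treating NA as an ordinary value, similar in spirit to
table(x, y, useNA = "ifany"). The name xyTableNA and its helper differs() are
hypothetical, not part of R, and the sketch leaves out the coordinate handling
and rounding (digits) that xyTable() does.

```r
# Hypothetical helper, not part of base R: counts (x, y) pairs, treating
# NA as a value of its own rather than as "unknown".
xyTableNA <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  if (n == 0L) return(list(x = x, y = y, number = integer()))

  o <- order(x, y)   # na.last = TRUE by default, so NAs sort last
  x <- x[o]
  y <- y[o]

  # TRUE where neighbouring values differ, with NA considered equal only to NA
  differs <- function(v) {
    a <- v[-1L]
    b <- v[-n]
    d <- a != b                      # NA wherever either side is NA
    miss <- is.na(d)
    d[miss] <- (is.na(a) != is.na(b))[miss]
    d
  }

  first <- c(TRUE, differs(x) | differs(y))
  list(x = x[first], y = y[first],
       number = diff(c((1L:n)[first], n + 1L)))
}

# The first example: every pair, including (2,NA), is counted once
res <- xyTableNA(c(1, 1, 2, 2, 2, 3), c(1, 2, 1, 3, NA, 3))
str(res)
```

Note that is.na() is also TRUE for NaN, so this sketch lumps NaN together with
NA; whether that is desirable would need to be decided.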
-Bill
On Tue, Apr 25, 2023 at 1:26 AM Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
> Hi all,
>
> Posted this many years ago (
> https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html), but
> either this slipped under the radar or my feeble mind is unable to
> understand what xyTable() is doing here and nobody bothered to correct me.
> I now stumbled again across this issue.
>
> x <- c(1, 1, 2, 2, 2, 3)
> y <- c(1, 2, 1, 3, NA, 3)
> table(x, y, useNA="always")
> xyTable(x, y)
>
> Why does xyTable() report that there are NA instances of (2,3)? I could
> understand the logic that the NA could be anything, including a 3, so the
> $number value for (2,3) is therefore unknown, but then the same should
> apply to (2,1), but here $number is 1, so the logic is then inconsistent.
>
> I stared at the xyTable code for a while and I suspect this is coming from
> order() using na.last=TRUE by default, but in any case, to me the behavior
> above is surprising.
>
> Best,
> Wolfgang
>
