[Rd] xyTable(x,y) versus table(x,y) with NAs
Serguei Sokol
@oko| @end|ng |rom |n@@-tou|ou@e@|r
Tue Apr 25 18:03:47 CEST 2023
On 25/04/2023 at 17:39, Bill Dunlap wrote:
> x <- c(1, 1, 2, 2, 2, 3)
> y <- c(1, 2, 1, 3, NA, 3)
>> str(xyTable(x,y))
> List of 3
> $ x : num [1:6] 1 1 2 2 NA 3
> $ y : num [1:6] 1 2 1 3 NA 3
> $ number: int [1:6] 1 1 1 NA NA 1
>
>
> How many (2,3)s do we have? At least one, the fourth entry, but the fifth
> entry, (2,NA), is possibly a (2,3) so we don't know and make the count NA.
> I suspect this is not the intended logic, but a byproduct of finding value
> changes in a sorted vector with the idiom x[-1] != x[-length(x)]. Also the
> following does follow that logic:
>
>> x <- c(1, 1, 2, 2, 5, 6)
>> y <- c(2, 2, 2, 4, NA, 3)
>> str(xyTable(x,y))
> List of 3
> $ x : num [1:5] 1 2 2 5 6
> $ y : num [1:5] 2 2 4 NA 3
> $ number: int [1:5] 2 1 1 1 1
Not really. If we take
x <- c(1, 1, 2, 2, 5, 6, 5, 5)
y <- c(2, 2, 2, 4, NA, 3, 3, 4)
we get
str(xyTable(x,y))
List of 3
$ x : num [1:7] 1 2 2 5 5 NA 6
$ y : num [1:7] 2 2 4 3 4 NA 3
$ number: int [1:7] 2 1 1 1 NA NA 1
How many (5, 3) pairs do we have? At least 1, but (5, NA) is possibly a
(5, 3), so we should get NA, yet we get 1.
How many (5, 4) pairs do we have? At least 1, but (5, NA) is possibly a
(5, 4), and here we do get NA. So the reconstructed logic is not consistent.
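
The asymmetry can be traced by replaying the comparison idiom by hand on
the data above. This is only a sketch: it assumes xyTable() sorts with
order(x, y) and marks group starts by comparing neighbours with !=, as the
idiom quoted above suggests, and it leaves out the rounding via digits.

```r
x <- c(1, 1, 2, 2, 5, 6, 5, 5)
y <- c(2, 2, 2, 4, NA, 3, 3, 4)
n <- length(x)

# Sort the pairs; among the ties on x == 5, order() puts the NA in y last,
# so (5, NA) lands after (5, 3) and (5, 4).
o  <- order(x, y)
xs <- x[o]
ys <- y[o]

# TRUE where a new (x, y) group starts. At the (5, NA) position x is
# unchanged (FALSE) and NA != 4 is NA, so FALSE | NA gives NA there.
first <- c(TRUE, (xs[-1L] != xs[-n]) | (ys[-1L] != ys[-n]))
print(first)

# Indexing with a logical NA yields NA: this is where the (NA, NA) row
# comes from.
print(xs[first])
print(ys[first])

# Counts are differences of group start positions. The NA start spoils the
# count of the group just before it, (5, 4), and its own count, but not
# the count of (5, 3), which ends at a known position.
number <- diff(c(seq_len(n)[first], n + 1L))
print(number)
```

So which neighbour gets the NA count depends only on the sort order, not
on which pairs the NA could actually stand for.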
Not to mention that a (NA, NA) pair appears while no (5, NA) pair is
produced.
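
For comparison, here is a minimal sketch of a variant that treats NA as a
value of its own, so that (5, NA) is reported as a pair with count 1. The
names xyTableNA and differs are hypothetical and not part of R; the sketch
keeps the sort-and-compare approach but swaps != for a comparison that
never returns NA. It is not a proposal for the actual fix.

```r
# Hypothetical helper, not part of R: counts distinct (x, y) pairs like
# xyTable(), but treats NA as an ordinary value equal only to itself.
xyTableNA <- function(x, y) {
    n <- length(x)
    if (n == 0L)
        return(list(x = x, y = y, number = integer()))
    o <- order(x, y)
    x <- x[o]
    y <- y[o]
    # TRUE where a and b differ; if either is NA, they differ exactly
    # when only one of them is NA. The result never contains NA.
    differs <- function(a, b)
        ifelse(is.na(a) | is.na(b), is.na(a) != is.na(b), a != b)
    first <- c(TRUE, differs(x[-1L], x[-n]) | differs(y[-1L], y[-n]))
    list(x = x[first], y = y[first],
         number = diff(c(which(first), n + 1L)))
}

x <- c(1, 1, 2, 2, 5, 6, 5, 5)
y <- c(2, 2, 2, 4, NA, 3, 3, 4)
res <- xyTableNA(x, y)
str(res)
# x is 1 2 2 5 5 5 6, y is 2 2 4 3 4 NA 3, number is 2 1 1 1 1 1 1
```

Whether NA should count as its own value or make the counts unknown is
exactly the design question here; the sketch only shows that the first
option is easy to get with the same algorithm.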
Best,
Serguei.
>
>
>
> table() does not use this logic, as one NA in a vector would make all the
> counts NA. Should xyTable have a way to handle NAs the way table() does?
>
> -Bill
>
> On Tue, Apr 25, 2023 at 1:26 AM Viechtbauer, Wolfgang (NP) <
> wolfgang.viechtbauer using maastrichtuniversity.nl> wrote:
>
>> Hi all,
>>
>> Posted this many years ago (
>> https://stat.ethz.ch/pipermail/r-devel/2017-December/075224.html), but
>> either this slipped under the radar or my feeble mind is unable to
>> understand what xyTable() is doing here and nobody bothered to correct me.
>> I now stumbled again across this issue.
>>
>> x <- c(1, 1, 2, 2, 2, 3)
>> y <- c(1, 2, 1, 3, NA, 3)
>> table(x, y, useNA="always")
>> xyTable(x, y)
>>
>> Why does xyTable() report that there are NA instances of (2,3)? I could
>> understand the logic that the NA could be anything, including a 3, so the
>> $number value for (2,3) is therefore unknown, but then the same should
>> apply to (2,1), but here $number is 1, so the logic is then inconsistent.
>>
>> I stared at the xyTable code for a while and I suspect this is coming from
>> order() using na.last=TRUE by default, but in any case, to me the behavior
>> above is surprising.
>>
>> Best,
>> Wolfgang
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> [[alternative HTML version deleted]]
>