[Rd] Error message when calling t.test() and aov() with a factor variables

Viechtbauer, Wolfgang (NP) wolfgang.viechtbauer at maastrichtuniversity.nl
Fri Oct 11 15:51:36 CEST 2024


> -----Original Message-----
> From: Kurt Hornik <Kurt.Hornik using wu.ac.at>
> Sent: Friday, October 11, 2024 14:18
> To: Viechtbauer, Wolfgang (NP) <wolfgang.viechtbauer using maastrichtuniversity.nl>
> Cc: r-devel <r-devel using r-project.org>
> Subject: Re: [Rd] Error message when calling t.test() and aov() with a factor
> variables
>
> >>>>> Viechtbauer, Wolfgang (NP) writes:
>
> > Hi all,
> > Just noticed that the error that arises when calling t.test() with factors
> > could be a bit clearer:
>
> >> t.test(factor(c(3,1,2,4,3,5,4,5)), factor(c(2,1,2,3,4,5)))
> > Error in var(x) : Calling var(x) on a factor x is defunct.
> >   Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
> > In addition: Warning message:
> > In mean.default(x) :
> >   argument is not numeric or logical: returning NA
>
> > Obviously, using factors as input is nonsense, but this might happen on
> > accident and then the error message could be a bit more on point. Similar for
> > aov():
>
> >> aov(factor(c(3,1,2,4,3,5,4,5)) ~ factor(c(2,1,2,2,2,1,2,1)))
> > Call:
> >    aov(formula = factor(c(3, 1, 2, 4, 3, 5, 4, 5)) ~ factor(c(2,
> >     1, 2, 2, 2, 1, 2, 1)))
> > Error in levels(x)[x] :
> >   only 0's may be mixed with negative subscripts
> > In addition: Warning messages:
> > 1: In model.response(mf, "numeric") :
> >   using type = "numeric" with a factor response will be ignored
> > 2: In Ops.factor(y, z$residuals) : '-' not meaningful for factors
>
> > Not a big deal and trying to catch all of the silly things users may
> > do is of course impossible, but for this one adding a check that the
> > (response) variable is actually numeric could be useful.
>
> Indeed.
>
> As always, the question is whether we want to give an error unless
> is.numeric, or ensure via as.numeric?

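I would issue an error. Using as.numeric() could lead to totally nonsensical results, because on a factor it returns the internal integer codes rather than the values the labels stand for. For example, the following returns 2 1 3, since the levels are sorted alphabetically (high, low, mid):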

as.numeric(factor(c("low","high","mid")))
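
For illustration, here is a minimal sketch of the kind of check I have in mind. The helper name check_numeric_response() and the wording of the message are made up for this example; nothing like it exists in stats, and where exactly such a check would go in t.test.default() or aov() is left open.

```r
# Hypothetical helper, not part of the stats package: stop with a clear
# message unless the input is numeric, instead of coercing it.
check_numeric_response <- function(x, name = deparse(substitute(x))) {
  if (!is.numeric(x))
    stop(sprintf("'%s' must be numeric, not of class \"%s\"",
                 name, class(x)[1L]),
         call. = FALSE)
  invisible(x)
}

f <- factor(c("low", "high", "mid"))

# Coercion silently yields the internal codes (alphabetical level order).
codes <- as.numeric(f)

# The check refuses the factor; capture the message for inspection.
res <- tryCatch(check_numeric_response(f),
                error = function(e) conditionMessage(e))
```

Note that is.numeric() is FALSE for factors even though they are stored as integers, so the check catches this case without any special handling.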

> Best
> -k
>
> > Best,
> > Wolfgang
