[R] Weird Behavior of mean
Richard O'Keefe
r@oknz @end|ng |rom gm@||@com
Sat Dec 14 00:28:19 CET 2024
My preference would be for anything that is defined as taking a
"logical" parameter to report an error if given anything else.
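
As a rough illustration only, a strict check along these lines could look like the sketch below. The names `check_flag()` and `strict_mean()` are hypothetical, made up for this example; nothing like them is claimed to exist in base R.

```r
# Hypothetical helper: accept only a single, non-NA logical value.
check_flag <- function(x, name = "na.rm") {
  if (!(is.logical(x) && length(x) == 1L && !is.na(x))) {
    stop(sprintf("'%s' must be TRUE or FALSE", name), call. = FALSE)
  }
  x
}

# Hypothetical wrapper: validate na.rm, then defer to mean().
strict_mean <- function(x, na.rm = FALSE) {
  na.rm <- check_flag(na.rm)
  mean(x, na.rm = na.rm)
}

x <- c(1, NA, 3)
strict_mean(x, na.rm = TRUE)    # NA is dropped before averaging
# strict_mean(x, na.rm = 20)    # would stop with an error
```

With a check like this, passing a number (or an overwritten T) for na.rm stops with an error instead of quietly taking one branch or the other.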
On Sat, 14 Dec 2024 at 12:21, Ben Bolker <bbolker using gmail.com> wrote:
>
> Thanks, I had missed/forgotten the fact that there is also an
> inconsistency between mean.default() and sd().
>
> sd() calls var(), which evaluates if(na.rm) [i.e., it will try to
> coerce `na.rm` to logical rather than testing isTRUE]
>
> IM(H?)O, it would be best for both mean.default() and sd() to use
> if(isTRUE(as.logical(na.rm))) -- this converts NULL, numeric(0), zero
> numeric values, etc. to FALSE, non-zero numeric values (including
> complex numbers not equal to 0+0i) to TRUE ... fails on un-coerceable
> stuff like functions, environments ...
>
>
> ‘as.logical’ attempts to coerce its argument to be of logical
> type. In numeric and complex vectors, zeros are ‘FALSE’ and
> non-zero values are ‘TRUE’. For ‘factor’s, this uses the ‘levels’
> (labels). Like ‘as.vector’ it strips attributes including names.
> Character strings ‘c("T", "TRUE", "True", "true")’ are regarded as
> true, ‘c("F", "FALSE", "False", "false")’ as false, and all others
> as ‘NA’.
>
>
> On 2024-12-13 5:43 p.m., Bert Gunter wrote:
> > Ivo, et al.:
> > --IMHO only ... and with apologies for verbosity
> >
> > Defining, let alone enforcing, "consistent behavior" can be a
> > philosophical conundrum: what one person deems "consistent" behavior
> > for a function across different data structures and circumstances may
> > not be the same as another's. While you may consider the issue clear
> > here, a glance at the source code shows that may not necessarily be
> > the case: mean() is an S3 generic, but sd() is derived from var()
> > which is in turn based on cov(), for which NA handling is more
> > complex.
> >
> > Anyway, for me, the only defensible standard is that the behavior of
> > overloaded function names should be accurately *documented* for each
> > use case, whether or not the semantics conform to any particular
> > paradigm of consistency. By this standard, I think mean() is behaving
> > correctly, as its Help page says:
> >
> > na.rm
> > a *logical* evaluating to TRUE or FALSE indicating whether NA values
> > should be stripped before the computation proceeds. [emphasis added]
> > Note: *not* a value that can be *coerced* to logical, but an actual
> > logical expression.
> >
> > But sd() is not, as its Help page says:
> > na.rm
> > logical. Should missing values be removed?
> > Note: So seemingly same as above, but as you noted, will work for
> > values that can be coerced to logical and not just actual logical
> > expressions.
> >
> > Cheers,
> > Bert
> >
> >
> >
> > On Fri, Dec 13, 2024 at 11:43 AM ivo welch <ivo.welch using ucla.edu> wrote:
> >>
> >> Isn't this still a little R buglet? I have overwritten T (even if that is
> >> my own fault, it is not that uncommon an error, because T is also a
> >> common abbreviation for the end of a time series; namespace pollution in R
> >> can be quite annoying, even though I understand that it is convenient in
> >> interactive mode). Nevertheless, I am passing into mean() a positive
> >> number for na.rm, and by definition, a positive number still means TRUE.
> >> Besides, sd() and mean() should probably treat this similarly, anyway. I
> >> do see the argument that functions cannot be proof against redefinitions of
> >> all sorts of objects that they can use. More philosophically, some
> >> variables should not be overwritable, or should at least trigger a warning.
> >>
> >> As Dante wrote, Abandon all hope ye who enter R.
> >>
> >> --
> >> Ivo Welch (ivo.welch using ucla.edu)
> >>
> >> ______________________________________________
> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> E-mail is sent at my convenience; I don't expect replies outside of
> working hours.