[Rd] 1.4.0: mean/sum of logicals
Prof Brian D Ripley
ripley@stats.ox.ac.uk
Sat, 6 Oct 2001 09:32:50 +0100 (BST)
On 5 Oct 2001, Peter Dalgaard BSA wrote:
> Torsten Hothorn <Torsten.Hothorn@rzmail.uni-erlangen.de> writes:
>
> > the NEWS file in 1.4.0-devel states:
> >
> > o mean() has `data frame' method applying mean column-by-column.
> > When applied to non-numeric data mean() now returns NA rather
> > than a confusing error message (for compatibility with S4).
> >
> >
> > which means:
> >
> > R> mean(c(TRUE, FALSE))
> > [1] NA
> > Warning message:
> > argument is not numeric: returning NA in: mean.default(c(TRUE, FALSE))
> >
> > but:
> >
> > R> sum(c(TRUE, FALSE))
> > [1] 1
> >
> > ?sum states:
> >
> > sum(..., na.rm=FALSE)
> >
> > Arguments:
> >
> > ...: numeric vectors.
> >
> > and clearly
> >
> > R> is.numeric(c(TRUE, FALSE))
> > [1] FALSE
> >
> >
> > this is confusing, isn't it? I think that `sum' and `mean' should take the
> > same arguments (and one probably would not allow summing logicals) or am
> > I missing something?
> >
> > Torsten
>
> Hmm. That slipped in without me noticing. Summing logicals is a fairly
> common practice, as in
>
> sem <- sd(x,na.rm=TRUE)/sqrt(sum(!is.na(x)))
>
> Taking means of logicals is somewhat more rare, but it does work in
> Splus 6.0 and it is a general rule to coerce logicals to 0/1, so I
> suspect that this is just an oversight and we want mean.default to
> start with
>
>   if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
>       warning("argument is not numeric: returning NA")
>       return(as.numeric(NA))
>   }
>
> If Brian really meant otherwise, he'll explain why when he gets back
> from Switzerland...
Back-compatibility. In data frames logicals used to get coerced to
two-level factors, not to numerics, and then mean would fail for them.
If a logical is an experimental factor, sums make sense but means do not.
Here is what I find confusing (1.3.1).
> x <- c(TRUE, FALSE)
> mean(x)
[1] 0.5
> DF <- data.frame(x = x)
> mean(DF$x)
Error in Summary.factor(..., na.rm = na.rm) :
"sum" not meaningful for factors
There, 1.4.0 is consistent: mean() returns NA in both cases.
Change it if you like, but do think through all the implications of the
inconsistent treatment of logicals (for which I have not had time as yet).
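
For anyone who wants to experiment with the implications, here is a minimal
sketch of the guard proposed in the quoted message, wrapped in a stand-alone
function. The name mean_sketch and the way the mean is computed after the
guard are assumptions made for illustration; this is not the actual
mean.default source.

```r
# Hypothetical helper for illustration only; not the real mean.default.
mean_sketch <- function(x, na.rm = FALSE) {
    # Guard from the quoted proposal: accept numeric, complex and logical.
    if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
        warning("argument is not numeric: returning NA")
        return(as.numeric(NA))
    }
    if (na.rm) x <- x[!is.na(x)]
    # sum() coerces TRUE/FALSE to 1/0, so for logicals this is the
    # proportion of TRUE values.
    sum(x) / length(x)
}

x <- c(TRUE, FALSE)
mean_sketch(x)                      # 0.5: the logical is treated as 0/1

f <- factor(x)                      # the same values held as a two-level factor
suppressWarnings(mean_sketch(f))    # NA: a factor fails all three type checks
```

With this guard a bare logical vector and the same values stored as a factor
still give different answers (0.5 versus NA), which is the inconsistency to
weigh before changing anything.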
--
Brian D. Ripley, ripley@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595