[Rd] zapsmall(x) for scalar x

Sun Dec 17 06:11:31 CET 2023

Zapping a vector of small numbers to zero would cause problems when
printing the results of summary(). For example, if
zapsmall(c(2.220446e-16, ..., 2.220446e-16)) == c(0, ..., 0) then
print(summary(2.220446e-16), digits = 7) would print
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0

The same problem can also appear when printing the results of
summary.glm() with show.residuals = TRUE if there's little dispersion
in the residuals.
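
To make the concern concrete, here is a minimal sketch. `zap_abs` is a hypothetical helper (not part of R) that always rounds to `digits` decimal places, which is what the proposal would do for a scalar; it is compared with the current, relative zapsmall() on a vector of identical tiny values.

```r
# Hypothetical variant, for illustration only: rounds to a fixed number of
# decimal places regardless of the magnitude of x.
zap_abs <- function(x, digits = getOption("digits")) round(x, digits = digits)

x <- rep(2.220446e-16, 5)

rel    <- zapsmall(x, digits = 7)  # current behaviour: scaled by max(abs(x))
zapped <- zap_abs(x, digits = 7)   # absolute rounding to 7 decimal places

all(rel > 0)      # TRUE: the values survive
all(zapped == 0)  # TRUE: every value is replaced by 0
```

With absolute rounding, any summary of such a vector would consist of zeros only, which is the situation described above.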

Steve

On Sat, 16 Dec 2023 at 17:34, Gregory Warnes <greg using warnes.net> wrote:
>
> I was quite surprised to discover that applying `zapsmall` to a scalar value has no apparent effect.  For example:
>
> > y <- 2.220446e-16
> > zapsmall(y)
> [1] 2.2204e-16
>
> I was expecting `zapsmall(x)` to act like
>
> > round(y, digits=getOption('digits'))
> [1] 0
>
> Looking at the current source code indicates that `zapsmall` is expecting a vector:
>
> zapsmall <-
> function (x, digits = getOption("digits"))
> {
>     if (length(digits) == 0L)
>         stop("invalid 'digits'")
>     if (all(ina <- is.na(x)))
>         return(x)
>     mx <- max(abs(x[!ina]))
>     round(x, digits = if (mx > 0) max(0L, digits - as.numeric(log10(mx))) else digits)
> }
>
> If `x` is a non-zero scalar, `mx` equals `abs(x)`, so zapsmall always keeps about `digits` significant digits and never zaps `x` to zero.
>
> The man page simply states:
> zapsmall determines a digits argument dr for calling round(x, digits = dr) such that values close to zero (compared with the maximal absolute value) are ‘zapped’, i.e., replaced by 0.
>
> and doesn’t provide any details about how ‘close to zero’ is defined.
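
Reading the source quoted above, 'close to zero' appears to be relative: the number of decimal places passed to round() is `digits` minus log10 of the largest absolute value. A small sketch of that computation follows; the vector `x` and the value `digits = 7` are illustrative choices, not taken from the man page.

```r
x <- c(1234.5, 1e-10, 2.220446e-16)
digits <- 7

# Same computation as in the body of zapsmall()
mx <- max(abs(x))                               # 1234.5
dr <- max(0L, digits - as.numeric(log10(mx)))   # about 3.9 decimal places

res <- round(x, digits = dr)   # the two tiny entries become 0

# For a single non-zero value, mx is the value itself, so the number of
# decimal places grows as the value shrinks and nothing is zapped.
y <- 2.220446e-16
dy <- digits - log10(abs(y))   # about 22.7 decimal places
```

So a value is zapped only when it is small compared with the largest entry of the same vector.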
>
> Perhaps handling the special case when `x` is a scalar (or only contains a single non-NA value) would make sense:
>
> zapsmall <-
> function (x, digits = getOption("digits"))
> {
>     if (length(digits) == 0L)
>         stop("invalid 'digits'")
>     if (all(ina <- is.na(x)))
>         return(x)
>     mx <- max(abs(x[!ina]))
>     round(x, digits = if (mx > 0 && (length(x)-sum(ina))>1 ) max(0L, digits - as.numeric(log10(mx))) else digits)
> }
>
> Yielding:
>
> > y <- 2.220446e-16
> > zapsmall(y)
> [1] 0
>
> Another edge case would be when all of the non-NA values are the same:
>
> > y <- 2.220446e-16
> > zapsmall(c(y,y))
> [1] 2.220446e-16 2.220446e-16
>
> Thoughts?
>
>
> Gregory R. Warnes, Ph.D.
> greg using warnes.net
> Eternity is a long time, take a friend!
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel