[Rd] hist(..., log="y")
David Scott
d@co036 @end|ng |rom uo@@@uck|@nd@@c@nz
Mon Aug 7 13:09:21 CEST 2023
Log histograms are of particular interest when dealing with heavy-tailed
data/distributions.
It is not just a matter of using a log scale on the y axis, though,
because the baseline of the histogram is at zero and the log of zero is
minus infinity.
I have implemented a version of a log histogram in the function logHist,
in my package DistributionUtils, which may be of interest if anyone
seriously wishes to add functionality to the base hist function.
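
To illustrate the baseline issue, here is a minimal sketch. It is not the
logHist code from DistributionUtils; the baseline of 0.5 counts and the
variable names are assumptions chosen purely for illustration. It computes
the bins with hist(plot = FALSE) and draws the bars with rect() from a
positive baseline, skipping empty bins:

```r
# Sketch only: a log-scale count histogram with a finite, positive baseline.
set.seed(1)
x <- rexp(1000, 1)                      # same kind of sample as in the quoted example

h    <- hist(x, plot = FALSE)           # compute the bins without drawing anything
keep <- h$counts > 0                    # empty bins would sit at log(0) = -Inf, so skip them
base <- 0.5                             # assumed baseline (half a count), not a standard value

nb    <- length(h$breaks)
left  <- h$breaks[-nb]                  # left edge of each bin
right <- h$breaks[-1]                   # right edge of each bin

# Set up empty log-scaled axes, then draw one rectangle per non-empty bin.
plot(range(h$breaks), c(base, max(h$counts)), type = "n", log = "y",
     xlab = "x", ylab = "count (log scale)",
     main = "Log-scale histogram (sketch)")
rect(left[keep], base, right[keep], h$counts[keep], col = "grey")
```

Any positive baseline below the smallest non-zero count would do; the
choice only affects how tall the bars for small counts appear.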
David Scott
On 7/08/2023 8:54 pm, Martin Maechler wrote:
> >>>>> Ott Toomet
> >>>>> on Sat, 5 Aug 2023 23:49:38 -0700 writes:
>
> > Sorry if this topic has been discussed earlier.
>
> > Currently, hist(..., log="y") fails with
>
> >> hist(rexp(1000, 1), log="y")
> > Warning messages:
> > 1: In plot.window(xlim, ylim, "", ...) :
> >   nonfinite axis=2 limits [GScale(-inf,2.59218,..); log=TRUE] -- corrected now
> > 2: In title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...) :
> >   "log" is not a graphical parameter
> > 3: In axis(1, ...) : "log" is not a graphical parameter
> > 4: In axis(2, at = yt, ...) : "log" is not a graphical parameter
>
> > The same applies for log="x"
>
> [...........]
>
> > This applies for the current svn version of R, and also a
> > few recent published versions. This is unfortunate for
> > two reasons:
>
> > * the error message is not quite correct--"log" is a
> > graphical parameter, but "hist" does not support it.
>
> No, not if you use R's (or S's before that) definition:
>
> graphical parameters := {the possible arguments of par()}
>
> log is *not* among these.
>
>
> > * for various kinds of data it is worthwhile to make
> > histograms in log scale. "hist" is a very nice and
> > convenient function and support for log scale would be
> > handy here.
>
> Yes, possibly (see below).
> Note that the above are not errors, but warnings,
> and there *is* some support, e.g.,
>
> > set.seed(1); range(x <- rlnorm(1111))
> [1] 0.04938796 45.16293285
> > hx <- hist(x, log="x", xlim=c(0.049, 47))
> Warning messages:
> 1: In title(main = main, sub = sub, xlab = xlab, ylab = ylab, ...) :
> "log" is not a graphical parameter
> 2: In axis(1, ...) : "log" is not a graphical parameter
> 3: In axis(2, at = yt, ...) : "log" is not a graphical parameter
>
> > str(hx)
> List of 6
> $ breaks : num [1:11] 0 5 10 15 20 25 30 35 40 45 ...
> $ counts : int [1:10] 1041 58 10 0 1 0 0 0 0 1
> $ density : num [1:10] 0.1874 0.01044 0.0018 0 0.00018 ...
> $ mids : num [1:10] 2.5 7.5 12.5 17.5 22.5 27.5 32.5 37.5 42.5 47.5
> $ xname : chr "x"
> $ equidist: logi TRUE
> - attr(*, "class")= chr "histogram"
>
> where we see that it *does* plot ... but crucially not the very first bin,
> because log(0) == -Inf, with over 90% (viz. 1041) counts.
>
> > I also played a little with the code, and it seems to be
> > very easy to implement. I am happy to make a patch if the
> > team thinks it is worth pursuing.
>
> > Cheers, Ott
>
> Yeah.. and that is the important question.
>
> Most statisticians know that a histogram is a pretty bad
> density estimator (notably if the natural density has an
> infinite support) compared to simple kernel density estimates,
> e.g. those by density().
> Hence, I'd argue that if you expect enough sophistication from
> your "viewer"s to understand a log-scale histogram, you
> should use a density with log="x" and/or "y", and I have
> successfully done so several times: It *does* work
> {particularly nicely if you use my sfsmisc::eaxis() for the log axis/es}.
>
> But you (and others) may have more good arguments why hist()
> should work with log="x" and/or log="y"...
>
> Also, if your patch is relatively small, its usefulness may
> outweigh the added complexity (and its long-term maintenance !).
>
> Martin
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
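
For comparison, the density-based alternative suggested in the quoted
message can be sketched as follows. This is only one possible way to do it
and not necessarily what was meant there: it estimates the density of
log(x) and transforms back, which is a choice made here so that the grid
stays positive on a log x axis. It uses plain base graphics rather than
sfsmisc::eaxis(), and all variable names are illustrative.

```r
# Sketch only: kernel density estimate shown on log-log axes.
set.seed(1)
x <- rlnorm(1111)                       # same sample as in the quoted example

# Estimate the density of log(x), then transform back to the scale of x:
# if Y = log(X) has density g, then X has density f(x) = g(log(x)) / x.
d  <- density(log(x))
xx <- exp(d$x)                          # evaluation grid on the original scale, always > 0
fx <- d$y / xx                          # back-transformed density estimate

plot(xx, fx, type = "l", log = "xy",
     xlab = "x (log scale)", ylab = "density (log scale)",
     main = "Density estimate on log axes (sketch)")
```

Because the grid is built on the log scale, there is no bin starting at
zero to be lost when the x axis is logarithmic.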
--
_________________________________________________________________
David Scott
Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Email: d.scott using auckland.ac.nz