[R] histogram first bar wrong position

Martin Maechler maechler at stat.math.ethz.ch
Thu Dec 22 17:19:14 CET 2016


>>>>> itpro  <itpro1 at yandex.ru>
>>>>>     on Thu, 22 Dec 2016 16:17:28 +0300 writes:

    > Hi, everyone.
    > I stumbled upon weird histogram behaviour.

    > Consider this "dice emulator":
    > Step 1: Generate uniform random array x of size N.
    > Step 2: Multiply each item by six and round up to the next integer to get numbers 1 to 6.
    > Step 3: Plot histogram.

    >> x<-runif(N)
    >> y<-ceiling(x*6)
    >> hist(y,freq=TRUE, col='orange')


    > Now what I get with N=100000

    >> x<-runif(100000)
    >> y<-ceiling(x*6)
    >> hist(y,freq=TRUE, col='green')

    > At first glance looks OK.

    > Now try N=100

    >> x<-runif(100)
    >> y<-ceiling(x*6)
    >> hist(y,freq=TRUE, col='red')

    > Now the first bar is not where it should be.
    > Hmm. Look again at the N=100000 histogram... The first bar is not where I want it either; it's only less striking due to the narrow bars.

    > So, first bar is always in wrong position. How do I fix it to make perfectly spaced bars?

Don't use histograms *at all* for such discrete integer data.

 N <- rpois(100, 5)
 plot(table(N), lwd = 4)
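
The same idea applied to the dice data from the question, as a minimal
sketch. The seed, the axis labels and the factor() call are additions for
illustration only; factor(y, levels = 1:6) merely makes sure that a face
which happens not to occur still gets its slot on the x axis.

```r
set.seed(1)                       # assumed seed, only for reproducibility
y <- ceiling(runif(100) * 6)      # the poster's dice values, 1..6

# Count each face; explicit levels keep faces with zero occurrences
counts <- table(factor(y, levels = 1:6))

# One vertical line per face, drawn exactly at x = 1, 2, ..., 6
plot(counts, lwd = 4, xlab = "face", ylab = "count")
```

barplot(counts) would be an alternative if bars are preferred to lines.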

Histograms should only be used for continuous data (or discrete data
with "many" possible values).
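
If a histogram is wanted nevertheless, one possible sketch (certainly not
the only way) is to give hist() one bin per face explicitly. By default
hist() picks its breaks via pretty() and uses right-closed intervals with
include.lowest = TRUE, so the smallest value lands at the left edge of the
first bin while every other integer sits at the right edge of its bin; this
is presumably why the first bar looks shifted. The seed, title and labels
below are assumptions for illustration.

```r
set.seed(1)                       # assumed seed, only for reproducibility
y <- ceiling(runif(100) * 6)      # the poster's dice values, 1..6

# Inspect the default breaks instead of guessing them
h0 <- hist(y, plot = FALSE)
print(h0$breaks)

# One bin per face, centred on the integers: (0.5,1.5], (1.5,2.5], ...
h1 <- hist(y, breaks = seq(0.5, 6.5, by = 1), freq = TRUE,
           col = "orange", main = "Dice throws", xlab = "face")
```

With these breaks each integer sits in the middle of its own bin, so all
six bars have equal width and spacing.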

It's a pain to see them so often "misused" for data like the 'N' above.

Martin Maechler,
ETH Zurich


