[R] histogram question

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Nov 15 12:17:44 CET 2001


On Thu, 15 Nov 2001, Erich Neuwirth wrote:

> thanks for all the help.
>
> one question remains.
> if histogram is meant for continuous data
> (which makes sense)
> why is it changing the defaults of the graphics
> depending on the amount of data,
> and not on the scale of the data.

That's implied by the theory for the choice of histogram bandwidths ....

> in both my examples, i had a data vector with numbers ranging from 0 to
> 10,
> once with 1000 elements,
> once with 100000 elements.
>
> this is the same "quality" of data.

It's not: more data is higher quality for a density estimate (as a
histogram is).

> should the graphics defaults not stay consistent with that?

The issue is that the bandwidth has been reduced, and in the second case
you have empty histogram bins which you interpreted as gaps.
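
A rough sketch of why this happens, assuming data on the range 0 to 10 as in the examples in this thread. Sturges' formula suggests ceiling(log2(n) + 1) classes, and (as far as I understand hist.default) that suggestion is handed to pretty() to pick round breakpoints. The helper name sturges() is made up for this illustration.

```r
# Sturges' formula: the suggested number of classes grows with log2(n),
# regardless of the scale of the data.
sturges <- function(n) ceiling(log2(n) + 1)

n_small <- sturges(1000)     # 11 classes
n_large <- sturges(100000)   # 18 classes

# pretty() turns the suggestion into "round" breakpoints.
# The range 0..10 is an assumption matching the examples in this thread.
breaks_small <- pretty(c(0, 10), n_small)  # bins of width 1
breaks_large <- pretty(c(0, 10), n_large)  # bins of width 0.5

diff(breaks_small)[1]
diff(breaks_large)[1]
```

With integer-valued data and bins of width 0.5, every second bin is necessarily empty, and those empty bins are what show up as gaps between the bars.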


Now R follows an early version of S in using Sturges' formula.  There are
much better rules of thumb, and after seeing the account in Venables &
Ripley, S-PLUS integrated them.  Perhaps R should do so too, but you will
find them in package MASS.
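
A minimal sketch comparing the rules of thumb on a continuous sample. It assumes a more recent R than the one discussed here, in which (as far as I know) nclass.Sturges(), nclass.scott() and nclass.FD() are available in grDevices and hist() accepts a rule name for breaks. The sample and the seed are arbitrary.

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
x <- rnorm(1000)            # continuous sample, purely illustrative

# Suggested number of classes under three rules of thumb
k_sturges <- nclass.Sturges(x)  # depends only on length(x)
k_scott   <- nclass.scott(x)    # uses the standard deviation of x
k_fd      <- nclass.FD(x)       # uses the interquartile range of x
c(Sturges = k_sturges, Scott = k_scott, FD = k_fd)

# hist() also accepts the rule by name; plot = FALSE returns the bins only
h <- hist(x, breaks = "Scott", plot = FALSE)
length(h$counts)
```

The Scott and Freedman-Diaconis rules take the spread of the data into account, so the bin width depends on both the sample size and the scale of the data, not on the sample size alone.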

> Ben Bolker wrote:
> >
> >   The basic problem is that hist() is really designed for continuous data,
> > and you're using it with discrete data.  You can either say
> >
> > r <- rbinom(100000,10,0.5)
> > hist(r,col=2,xlim=c(0,10),ylim=c(0,30000),
> >      breaks=seq(-0.5,10.5,by=1))
> >
> > so that the bins span (-0.5 to 0.5, 0.5 to 1.5, ...)
> >
> > or (arguably better, because it is more sensible with discrete data)
> >
> > barplot(table(r),space=0)
> >
> > On Mon, 12 Nov 2001, Erich Neuwirth wrote:
> >
> > > hist(rbinom(1000,10,0.5),col=2,xlim=c(0,10),ylim=c(0,300))
> > > gives a histogram with "touching bars"
> > >
> > > hist(rbinom(100000,10,0.5),col=2,xlim=c(0,10),ylim=c(0,30000))
> > > gives a histogram with space between the bars.
> > >
> > > is there a way to control the space between the bars easily?
>
> --
> Erich Neuwirth, Computer Supported Didactics Working Group
> Visit our SunSITE at http://sunsite.univie.ac.at
> Phone: +43-1-4277-38624 Fax: +43-1-4277-9386
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
