[R] Questions about histograms

Bill.Venables at csiro.au Bill.Venables at csiro.au
Mon Feb 11 02:38:41 CET 2008


Andre,

Regarding your first question, it is by no means clear that there is
anything to fix; in fact, I'm sure there is nothing to fix.  That the
height of a bar is greater than one is irrelevant: the width of the bar
is much less than one, and so is the product of height and width.  The
area of a bar is height x width, not just height, and it is the areas
that sum to one.
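
A small sketch may make the point concrete. The data below are made up
for illustration (they are not the original poster's data), and the
variable names are arbitrary; the only thing relied on is the 'density'
and 'breaks' components of the object that hist returns.

```r
# Illustrative data only (an assumption, not the original poster's data):
# exponential values, almost all of them far smaller than 1.
set.seed(1)
x <- rexp(1000, rate = 20)

# plot = FALSE returns the histogram object without drawing it; its
# 'density' component holds the bar heights that prob = TRUE would plot.
h <- hist(x, plot = FALSE)

widths     <- diff(h$breaks)            # bin widths, each much less than 1
max_height <- max(h$density)            # tallest bar, well above 1
total_area <- sum(h$density * widths)   # heights x widths sum to 1

print(max_height)
print(total_area)
```

The tallest bar is far higher than one, yet the areas still add up to
one because every bin is narrow.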

Regarding the second question, logarithmic breaks: I'm not aware of
anything currently available to do this, but the tools are there for you
to do it yourself.  The 'breaks' argument to hist allows you to specify
your breaks explicitly (among other things), so it's just a matter of
setting up the logarithmic (or, more precisely, 'geometric progression')
bins yourself and passing them on to hist.
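
One possible way to set this up is sketched below, assuming strictly
positive data. The sample data, the ratio of 2 and the variable names
are all illustrative choices, not anything prescribed by hist; any
ratio greater than one would do.

```r
# Illustrative, strictly positive, right-skewed data (an assumption).
set.seed(1)
x <- rlnorm(1000, meanlog = 1, sdlog = 1)

# Break points in geometric progression: each is 'ratio' times the
# previous one, so each bin is 'ratio' times as wide as the one before.
ratio  <- 2                                  # hypothetical choice
lo     <- floor(log(min(x), base = ratio))   # exponent at or below min(x)
hi     <- ceiling(log(max(x), base = ratio)) # exponent at or above max(x)
breaks <- ratio^seq(lo, hi)

# Pass the breaks explicitly; plot = FALSE just returns the object.
h <- hist(x, breaks = breaks, plot = FALSE)

widths       <- diff(h$breaks)
width_ratios <- widths[-1] / widths[-length(widths)]  # all equal to 'ratio'
total_area   <- sum(h$density * widths)               # still 1

print(width_ratios)
print(total_area)
```

Calling plot(h) afterwards draws the histogram. When the breaks are not
equally spaced, hist plots densities rather than counts by default, so
the total area remains one.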

 


Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables at csiro.au
http://www.cmis.csiro.au/bill.venables/ 

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Andre Nathan
Sent: Monday, 11 February 2008 11:14 AM
To: r-help at r-project.org
Subject: [R] Questions about histograms

Hello

I'm doing some experiments with the various histogram functions and I
have two questions about the "prob" option and binning.

First, here's a simple plot of my data using the default hist()
function:

> hist(data[,1], prob = TRUE, xlim = c(0, 35))

  http://go.sneakymustard.com/tmp/hist.jpg

My first question is regarding the resulting plot from hist.scott() and
hist.FD(), from the MASS package. I'm setting prob to TRUE in these
functions, but as can be seen in the images below, the value for the
first bar of the histogram is well above 1.0. Shouldn't the total area
be 1.0 in the case of prob = TRUE?

> hist.scott(data[,1], prob = TRUE, xlim=c(0, 35))

  http://go.sneakymustard.com/tmp/scott.jpg

> hist.FD(data[,1], prob = TRUE, xlim=c(0, 35))

  http://go.sneakymustard.com/tmp/FD.jpg

Is there anything I can do to "fix" these plots?

My second question is related to binning. Is there a function or package
that allows one to use logarithmic binning in R, that is, create bins
such that the length of a bin is a multiple of the length of the one
before it?

Pointers to the appropriate docs are welcome; I've been searching for
this and couldn't find any info.

Best regards,
Andre

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


