[R] stacking histograms

Liaw, Andy andy_liaw at merck.com
Tue Oct 28 03:49:31 CET 2003


The hist() function expects to be given the data, not the counts in the bins.
It sounds like you are giving hist() the counts.
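
As a rough sketch of calling hist() on raw data with your own bins: the breaks
have to be the full set of bin edges (here 11 edges for 10 bins), not just one
column of the table. The edges below are copied from your message; the vector
x is a made-up stand-in for your observed values, since I don't have depv.txt.

```r
# Bin edges copied from the table in the question.
lower <- c(-4.000, -3.453, -2.906, -2.359, -1.812,
           -1.265, -0.718, -0.171,  0.376,  0.923)
upper <- c(-3.453, -2.906, -2.359, -1.812, -1.265,
           -0.718, -0.171,  0.376,  0.923,  2.017)

# All 11 edges: the first lower edge followed by every upper edge.
breaks <- c(lower[1], upper)

# Hypothetical observations (NOT the poster's data).
x <- c(-3.9, -1.0, 0.0, 0.5, 1.5)

# With breaks spanning the range of x, hist() does the counting itself.
h <- hist(x, breaks = breaks, plot = FALSE)
print(h$counts)

# Using only the lower edges leaves everything above 0.923 uncovered,
# so hist() stops with an error.
res <- try(hist(x, breaks = lower, plot = FALSE), silent = TRUE)
print(inherits(res, "try-error"))
```

If some observed values fall outside -4 to 2.017, the edges would need to be
widened first; the above assumes they do not.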

One thing you could try is constructing an object of class "histogram" by
hand (see the "Value" section of ?hist) and just plot() it.  However,
beware that when the bins do not all have the same width, hist() by default
plots a true density; i.e., the total area of the bars is one.  If you just
plot the counts and the bins do not have constant width, then your plot will
look a bit strange.
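
A minimal sketch of building such an object by hand, using the edges and
counts from your message. The component names follow the "Value" section of
?hist; the label "depv" and the way equidist is computed are my own choices,
so treat this as an illustration rather than the one right way to do it.

```r
# Bin edges and counts copied from the table in the question.
lower  <- c(-4.000, -3.453, -2.906, -2.359, -1.812,
            -1.265, -0.718, -0.171,  0.376,  0.923)
upper  <- c(-3.453, -2.906, -2.359, -1.812, -1.265,
            -0.718, -0.171,  0.376,  0.923,  2.017)
counts <- c(23, 1, 5, 5, 5, 13, 21, 49, 26, 13)

breaks <- c(lower[1], upper)   # 11 edges for 10 bins
widths <- diff(breaks)         # the last bin is twice as wide as the others

h <- structure(
  list(breaks   = breaks,
       counts   = counts,
       # density = count / (total count * bin width), so the areas sum to 1
       density  = counts / (sum(counts) * widths),
       mids     = (lower + upper) / 2,
       xname    = "depv",      # label used in the title (my assumption)
       equidist = isTRUE(all.equal(min(widths), max(widths)))),
  class = "histogram")

plot(h)               # density scale, since the bins are not equal width
plot(h, freq = TRUE)  # raw counts; R warns that the areas are misleading
```

The second plot() call is the "strange" case: the last bar is drawn twice as
wide at its full count, so it looks more important than it is.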

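For "stacking", your barplot() approach is probably easiest, but again be
careful how you read the resulting graph: barplot() draws every bar with the
same width on a categorical axis, so bins of unequal width are not shown to
scale.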
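
Something along these lines might do for the stacking. The two sets here are
simulated purely for illustration (I don't have your A and B), and the 0.5
bin width is arbitrary; the only point is that both sets are counted with the
same breaks and the counts are stacked as the rows of a matrix.

```r
set.seed(1)                    # simulated data, purely illustrative
A <- rnorm(100, mean = -0.5)
B <- rnorm(60,  mean =  0.5)

# Shared, equal-width breaks that are guaranteed to span both sets.
breaks <- seq(floor(min(A, B)), ceiling(max(A, B)), by = 0.5)

# Let hist() do the counting only; no plot is drawn here.
hA <- hist(A, breaks = breaks, plot = FALSE)
hB <- hist(B, breaks = breaks, plot = FALSE)

# One row per set, one column per bin.
counts <- rbind(A = hA$counts, B = hB$counts)

# A matrix argument gives stacked bars by default (beside = FALSE);
# space = 0 makes the bars touch, as in a histogram.
barplot(counts, names.arg = hA$mids, space = 0,
        xlab = "bin midpoint", ylab = "count", legend.text = TRUE)
```

Labelling the bars with the bin midpoints, as you describe, is fine as long
as the bins all have the same width.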

HTH,
Andy

> -----Original Message-----
> From: Rajarshi Guha [mailto:rxg218 at psu.edu] 
> Sent: Monday, October 27, 2003 9:33 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] stacking histograms
> 
> 
> Hi,
>   I have a set of observations which are divided into two 
> sets A and B. I have some code that bins the dataset into 10 
> bins based on the max and min of the observed values. 
> 
> I would like to make a histogram of A & B using my calculated 
> bins but plot the distribution of B on top of A (like a 
> stacked barplot). This is possible since both sets A & B are 
> binned using the same bin ranges.
> 
> I have my data in the format:
> 
> -4.000000 -3.453000 23
> -3.453000 -2.906000 1
> -2.906000 -2.359000 5
> -2.359000 -1.812000 5
> -1.812000 -1.265000 5
> -1.265000 -0.718000 13
> -0.718000 -0.171000 21
> -0.171000 0.376000 49
> 0.376000 0.923000 26
> 0.923000 2.017000 13
> 
> where the first column is the lower value for the bin, the 
> second column the upper value for the bin and the last column 
> is the frequency for that bin
> 
> When I call the hist() function I get
> 
> > depv <- scan('depv.txt') # a vector of observed values
> >
> > # the bin boundaries described above
> > breakvals <- read.table('bindata.txt')
> >
> > hist(depvt, breaks=breakvals$V1)
> >
> Error in hist.default(depvt, breaks = br$V1) :
>         some `x' not counted; maybe `breaks' do not span range of `x'
> 
> I get the same error when I specify breakvals$V2. 
> 
> Some of my observed values lie beyond the lower boundary of 
> the last bin (last item of column 1) or below the upper 
> boundary of my first bin (first row of column 2). Is this the 
> reason why this error occurs?
> 
> How should I specify the breaks?
> 
> In addition is there any way I can plot the two histograms on 
> top of each other?
> 
> I have tried using the barplot() function but when it comes 
> to marking the x axis I'm not sure as to how to proceed. What 
> I have been doing is to calculate the mid point of each bin 
> and use that as the label for each column - is this a valid 
> way to represent the histogram? (This way it is easy for me 
> to plot the histograms for the two sets together)
> 
> Thanks
> 
> -------------------------------------------------------------------
> Rajarshi Guha <rxg218 at psu.edu> <http://jijo.cjb.net>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> "A fractal is by definition a set for which the Hausdorff 
> Besicovitch dimension strictly exceeds the topological dimension."
> -- Mandelbrot, "The Fractal Geometry of Nature"
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list 
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>



