# [Rd] Determining the break points by hist() leads to errors (PR#2432)

**Peter Dalgaard BSA**
p.dalgaard@biostat.ku.dk

*Wed Jan 8 19:59:02 2003*

volker.franz@tuebingen.mpg.de writes:
> Hi,
>
> if I determine the break points using the hist() function and then try
> to re-use these in a new histogram, R fails. Here is an example of the
> problem:
>
> ##First, plot a histogram:
> data(islands)
> foo <- hist(islands,freq=T)
>
> ##Now, try to plot it again, with the previously determined break points:
> hist(islands,breaks=foo$breaks,freq=T)
>
> ##... this leads to the warning message:
> Warning message:
> the AREAS in the plot are wrong -- rather use `freq=FALSE'!
> in: plot.histogram(r, freq = freq, col = col, border = border, angle =
>
> ##The reason for this seems to be that the breaks are NOT
> ##equidistant (despite foo$equidist being TRUE!):
>
> > foo$breaks
> [1] -0.0018 2000.0018 4000.0018 6000.0018 8000.0018 10000.0018
> [7] 12000.0018 14000.0018 16000.0018 18000.0018
>
> ##Correcting this (by changing the first element of foo$breaks):
> corr.breaks <- c(+0.0018,2000.0018,4000.0018,6000.0018,8000.0018,
> 10000.0018,12000.0018,14000.0018,16000.0018,18000.0018)
>
> ##...leads to the desired result:
> hist(islands,breaks=corr.breaks,freq=T)

...for your data. There's a reason why the first breakpoint is
adjusted in the opposite direction, namely to get exact zeros counted
into the first bin. Of course, since x in theory has a continuous
distribution, you in theory don't have observations on the boundary,
but in practice, theory and practice are not the same.
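
To make the boundary point concrete, here is a minimal sketch. It uses `cut()` with right-closed bins as a stand-in for the binning step; it is not the counting code that hist() itself runs, and the data values are made up for illustration.

```r
# Toy data containing an exact zero (hypothetical values, not the islands data).
x <- c(0, 1500, 2500)

# Right-closed bins (a, b] with the first break exactly at 0:
# the zero sits on the open boundary and is assigned to no bin (NA).
plain <- cut(x, breaks = c(0, 2000, 4000), labels = FALSE)

# Same bins with the first break nudged down and the others nudged up,
# mirroring the break values quoted in the report: the zero lands in bin 1.
nudged <- cut(x, breaks = c(-0.0018, 2000.0018, 4000.0018), labels = FALSE)

print(plain)
print(nudged)
```

Nudging the first break upward instead, as in the reporter's `corr.breaks`, would leave an exact zero outside the first bin in this sketch.
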
So the proper fix would be different. Currently we have

    h <- diff(breaks)
    equidist <- !use.br || diff(range(h)) < 1e-07 * mean(h)

which likely needs a larger tolerance since

    diddle <- 1e-07 * max(abs(range(breaks)))

goes in both directions.
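
The arithmetic behind this can be checked with a small sketch. It only replays the two expressions quoted above on the break values from the report. The un-fuzzed grid `seq(0, 18000, by = 2000)` and the names `fuzzed`, `spread` and `tol` are assumptions introduced here, and this is not the actual hist() source; current versions of R may behave differently.

```r
# Assumed un-fuzzed, evenly spaced breaks, reconstructed from the
# values quoted in the report.
breaks <- seq(0, 18000, by = 2000)

# The adjustment quoted above: the first break moves down by diddle,
# all the others move up by diddle.
diddle <- 1e-07 * max(abs(range(breaks)))
fuzzed <- breaks + c(-diddle, rep(diddle, length(breaks) - 1))

# Replay the equidistance test on the fuzzed breaks, as would happen
# when they are passed back in via breaks= (so use.br is TRUE).
h <- diff(fuzzed)
spread <- diff(range(h))   # first bin is wider by about 2 * diddle
tol <- 1e-07 * mean(h)
equidist <- spread < tol

print(c(diddle = diddle, spread = spread, tol = tol))
print(equidist)
```

Because the adjustment is scaled by the largest break while the tolerance is scaled by the mean bin width, the first bin's extra width of about 2 * diddle exceeds the tolerance here, so the re-used breaks fail the test.
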
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk) FAX: (+45) 35327907