[R] ggplot2 histograms... a subtle error found

hadley wickham h.wickham at gmail.com
Tue Aug 10 05:04:00 CEST 2010


> When ggplot2 verifies the widths before stacking (the default position for
> histograms), it computes the widths from the minimum and maximum values for
> each bin.  However, because the width of the bins (0.28) is much smaller
> than the scale of the edges (6.8e+09), there is some floating-point
> rounding error and the widths don't all come out equal:
>
> # in ggplot2::collide
> with(data, xmax-xmin)
> # [1] 0.2799988 0.2799988 0.2800007 0.2799988 0.2799988 0.2799988 0.2800007 0.2799988 0.2799988
> #[10] 0.2799988 0.2800007 0.2799988 0.2799988 0.2799988 0.2800007 0.2799988 0.2799988 0.2800007
> #[19] 0.2799988 0.2799988 0.2799988 0.2800007 0.2799988 0.2799988 0.2799988 0.2800007 0.2799988
> #[28] 0.2799988 0.2799988 0.2800007 0.2799988 0.2799988
>
> unique(with(data, xmax - xmin))
> #[1] 0.2799988 0.2800007
>
> So ggplot2 concludes the widths are not equal and gives the error you see.

Well, what I actually check is length(widths) > 1 && sd(widths) > 1e-6,
but in this case sd(widths) is 1.35e-06, just over my threshold. I
could change this, but that already seems like a fairly conservative
check to me, and I don't know enough about floating point to be sure
of the consequences of raising it further.
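
To make the trade-off concrete, here is a minimal sketch in R. The bin
edges are made up to sit at the scale mentioned above (they are not the
data from the original report), and the magnitude-scaled tolerance is
only an assumption about one possible alternative, not what ggplot2
does.

```r
# Hypothetical bin edges: 32 bins of width 0.28 starting at 6.8e9.
width  <- 0.28
breaks <- 6.8e9 + width * (0:32)
xmin   <- breaks[-length(breaks)]
xmax   <- breaks[-1]
widths <- xmax - xmin

# Each edge is rounded to the nearest representable double, and that
# rounding shows up as slightly different widths.
unique(widths)
spread <- diff(range(widths))

# Absolute check, as described above.
abs_unequal <- length(widths) > 1 && sd(widths) > 1e-6

# Assumed alternative: allow a few units of rounding error at the
# magnitude of the edges instead of a fixed 1e-6.
tol <- 4 * .Machine$double.eps * max(abs(breaks))
rel_unequal <- length(widths) > 1 && spread > tol
```

Whether the absolute check fires depends on the exact edges. The scaled
tolerance grows with the size of the edges, so it treats these widths as
equal, at the cost of being less strict for data far from zero.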

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


