[R] log y 'axis' of histogram
Hadley Wickham
hadley at rice.edu
Mon Aug 30 20:33:45 CEST 2010
>> That doesn't justify the use of a _histogram_ - and regardless of
>
> The usage highlights meaningful characteristics of the data.
> What better justification for any method of analysis and display is
> there?
That you're displaying something that is mathematically well founded
and meaningful. My emphasis there was on _histogram_, though: I don't
think a histogram makes sense here, but there are other ways of
displaying the same data that would (e.g. a frequency polygon, or maybe
a density plot).
>> what distributional display you use, logging the counts imposes some
>> pretty heavy restrictions on the shape of the distribution (e.g. that
>> it must not drop to zero).
>
> Does there have to be a recognized statistical distribution to use R?
My point is about the display, not the distribution - if your binned
counts look like 1, 100, 1000, 100, 0, 0, 10, 1000, 1000, how do you
display the log counts for the empty bins?
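
To make the problem concrete, here is a minimal base R sketch that uses
only the illustrative counts above (they are made-up numbers, not real
data). The variable names are mine.

```r
# Illustrative binned counts from the message; two of the bins are empty.
counts <- c(1, 100, 1000, 100, 0, 0, 10, 1000, 1000)

# log10(0) is -Inf in R, so the empty bins have no finite position
# on a log axis.
log_counts <- log10(counts)
print(log_counts)

# Positions of the bins that cannot be placed on a log scale.
empty_bins <- which(!is.finite(log_counts))
print(empty_bins)
```

Whatever the display does with those bins (drop them, clip them, draw
them at the axis), it is making a choice that the data do not dictate.
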
> In my case I am using R for all of the analysis and graphics in a
> new book. This means that sometimes I have to deal with data sets
> that are more or less a jumble of numbers with patterns in a few
> places. For instance, the numeric value of integer constants
> appearing as one operand of the binary bitwise-AND operator (see
> figure 1224.1 of www.knosof.co.uk/cbook/usefigtab.pdf, raw data
> at: www.knosof.co.uk/cbook/bandcons.hist.gz)
>
> qplot(band, binwidth=8, geom="histogram") + scale_y_log()
> does a good job of highlighting the peaks.
I couldn't find that figure, but I'd think geom = "freqpoly" would be
more appropriate than geom = "histogram". (I'd also suggest adding a bit
more space between the data and the margins in your figures - they
overlap in many plots.)
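
A rough sketch of the freqpoly suggestion, written against the ggplot()
interface of current ggplot2 rather than the qplot() call quoted above.
The data frame and its band column are simulated stand-ins (an
assumption on my part, not the bandcons data); only the binwidth of 8 is
taken from the quoted call.

```r
library(ggplot2)

# Simulated stand-in for the integer constants (hypothetical values, not
# the real data): a few heavily used constants plus a sparse background.
set.seed(1)
band <- c(rep(c(1, 15, 255), times = c(1000, 400, 800)),
          sample(0:255, 200, replace = TRUE))
df <- data.frame(band = band)

# Frequency polygon of the binned counts, with the counts on a log10 axis.
p <- ggplot(df, aes(x = band)) +
  geom_freqpoly(binwidth = 8) +
  scale_y_log10()

print(p)
```

A line through the bin counts does not need a baseline the way bars do,
which is why it sits more comfortably on a log axis. Bins with a zero
count still have no finite log value, so they may show up as warnings or
gaps in the line.
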
Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/